Abstract

This work considers the problem of finding optimal replacement policies that
minimize the expected total cost of maintaining a satellite constellation. The problem
is modeled using discrete-time Markov decision processes to determine the replacement
policy by allowing the satellite constellation to be in one of a finite number
of  states  at  each  decision  epoch.  The  constellation  stochastically  transitions  at  each 
time  step  from  one  state  to  another  as  determined  by  a  set  of  transition  probabilities. 
At  each  decision  epoch,  a  decision  maker  chooses  an  action  from  a  set  of  allowable 
actions  for  the  current  system  state.  A  cost  associated  with  each  possible  action 
is  determined  by  the  number  of  satellites  purchased,  launched,  or  held  in  storage, 
as well as the operational capability of the constellation. The system is evaluated
for a given time horizon using the standard Backward Induction Algorithm of Markov
decision processes (stochastic dynamic programming) to determine the optimal replacement
policy and the minimum expected total cost. Example problems using
notional data are presented to demonstrate the solution procedures. Sensitivity analysis
of problem parameters is performed to investigate their impact on the minimum
expected total cost of operating the constellation over a specified time horizon.
OPTIMAL REPLACEMENT POLICIES
FOR  SATELLITE  CONSTELLATIONS 


1.  Introduction 

1.1 Background

The  United  States  Air  Force  maintains  a  wide  variety  of  satellite  constellations 
used  for  such  purposes  as  navigation,  communications,  weather,  early  warning,  and 
intelligence  collection.  As  the  satellites  that  compose  these  constellations  deteriorate 
and  tend  towards  failure,  their  replacement  is  essential.  Satellites  that  have  been  in 
service  for  any  length  of  time  suffer  some  amount  of  degradation,  but  most  are  still 
useful  and  able  to  accomplish  their  mission  to  varying  degrees,  until  the  degradation 
is  substantial  enough  that  the  satellite  is  deemed  unable  to  satisfactorily  accomplish 
the  mission.  Ideally,  each  satellite  would  be  replaced  just  prior  to  its  failure  to 
prevent  degradation  of  the  constellation’s  ability  to  perform  its  mission,  while  also 
maximizing  the  useful  life  of  each  satellite.  Degrading  the  capability  of  a  constellation 
to  perform  its  mission  can  have  grave  consequences  to  national  security,  especially 
considering  the  mission  of  the  constellation,  the  state  of  world  affairs,  and  military 
actions  in  progress. 

One way to ensure a replacement is available in the event of satellite failure
is to maintain spare satellites on-orbit. However, satellites are extremely expensive
to build and launch, making this approach unrealistic except for the most critical
assets. The financial cost of maintaining a satellite constellation must be weighed
against the “cost” in terms of national security of a loss in mission capability
due to an on-orbit failure of a satellite when no replacement is available. It is
therefore desirable to prescribe a satellite replacement policy that balances the
need to maintain constellation mission capabilities against the need to avoid undue
replacement costs by minimizing the monetary and security costs to the nation over
the lifetime of the satellite system.

In a national defense context, finding such an optimal replacement policy is
important to the United States government and its citizens because the policy
minimizes the cost of providing for the security of the nation. Optimal satellite
replacement policies are also important to companies in the private sector, such as
telecommunications or satellite television providers. While a company may not be
directly concerned with national security, customer satisfaction and good will are
important performance metrics for the survival of any company. Competitive firms
desire to minimize the overall costs while maintaining a high level of customer
satisfaction.

The money saved from implementing such policies can be used to make improvements
to the satellite constellation’s capabilities or robustness, or the money
could  be  directed  to  different  projects  altogether,  making  some  previously  unfunded 
or  underfunded  programs  possible.  Monetary  savings  on  government  systems  could 
also  be  passed  along  to  taxpayers  in  the  form  of  tax  breaks.  In  the  case  of  a  private 
firm,  savings  can  be  passed  to  investors  in  the  form  of  dividends  or  to  customers  in 
the  form  of  lower  prices,  creating  a  competitive  advantage  for  the  firm. 

Research in the areas of optimal replacement policies and the modeling of satellite
constellations is relevant to the problems presented here. Optimal replacement
problems have been studied extensively in the stochastic operations research
literature. Solutions for general mechanical or electrical systems are often found using
renewal theory. Open source literature on the modeling of satellite constellations,
however, is much sparser. Only a few journal articles have been published
addressing different approaches to modeling satellite constellations.

Problems involving the modeling of satellite constellations generally use Monte-Carlo
simulation to determine actions for maintaining constellations, predicting
satellite reliability, or assessing satellite constellation availability [15] [21] [23]. The
models themselves are sometimes set up purely as simulation models based on satellite
failure times, or the satellite constellations are modeled using a network of queues.
In both cases, Monte-Carlo simulation is generally used to analyze the model due to
issues of complexity. When using Monte-Carlo simulation, one run of the simulation
results in one data point, or one possible outcome, of all the possible
outcomes that could occur in the experiment. To appropriately interpret the results,
the experiment must be replicated numerous times and the results subjected to careful
statistical analysis to make the correct inferences regarding system performance.
Monte-Carlo simulation can be used to compare two or more models, but does not
allow the user to determine if an evaluated model is optimal. For this reason, an
analytical solution is preferred because such a solution can be shown to be optimal.
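
The point about replication can be illustrated with a minimal Python sketch. Everything in it is a notional assumption made for illustration: a four-satellite constellation, independent exponential lifetimes with a 10-year mean, and a 5-year horizon. Each replication yields a single outcome, and only the collection of replications supports a statistical statement such as a confidence interval.

```python
import math
import random
import statistics

def one_replication(rng, n_satellites=4, mean_life_years=10.0, horizon_years=5.0):
    """One simulation run: count satellites still operating at the horizon."""
    lifetimes = [rng.expovariate(1.0 / mean_life_years) for _ in range(n_satellites)]
    return sum(1 for life in lifetimes if life > horizon_years)

def replicate(n_reps, seed=0):
    """Repeat the experiment and summarize it with a 95% normal-approximation interval."""
    rng = random.Random(seed)
    outcomes = [one_replication(rng) for _ in range(n_reps)]
    mean = statistics.mean(outcomes)
    half_width = 1.96 * statistics.stdev(outcomes) / math.sqrt(n_reps)
    return outcomes, mean, half_width

outcomes, mean, half_width = replicate(2000)
print(f"estimated operating satellites at year 5: {mean:.3f} +/- {half_width:.3f}")
```

The interval describes the performance of the one policy that was simulated; it says nothing about whether a better policy exists, which is the limitation noted above.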

This  thesis  approaches  the  problem  of  finding  an  optimal  satellite  replacement 
policy  from  an  analytical  point  of  view.  The  specific  approach  taken  to  solve  the 
problem is to analytically model satellite constellations and then use Markov decision
processes to determine the optimal replacement policy of the satellites. In
this context, optimal means the minimum expected total cost over the time horizon
evaluated.  Using  Markov  decision  processes  to  model  the  stochastic  evolution  of 
a  deteriorating  satellite  constellation  is  useful  because,  for  the  given  inputs,  they 
provide  a  provably  optimal  replacement  policy  under  certain  assumptions.  Such  an 
optimal  policy  avoids  a  potential  source  of  error  because  there  is  no  need  to  make 
inferences  from  the  model  outputs  regarding  the  optimality  of  the  solution. 

1.2  Problem  Definition  and  Methodology 

In  this  thesis,  a  satellite  constellation  is  stochastically  modeled  and  analyzed 
to  find  a  satellite  replacement  policy  which  minimizes  the  expected  monetary  and 
opportunity  costs  (e.g.  national  security  costs  and  costs  associated  with  gain  or  loss 
of  customer  satisfaction)  of  maintaining  the  constellation.  While  certain  budgetary 
constraints may be imposed on the implementation of a policy (e.g. a maximum
annual budget), the optimal policy found herein does not take into account budgetary
constraints  and  is  useful  for  establishing  budgets  or  lobbying  for  funding  levels  that 
minimize  costs  over  the  lifetime  of  the  system. 

The research objectives of this thesis are to analytically model satellite constellations,
to find optimal satellite replacement policies for maintaining constellations,
and to study how changes to model parameters affect the minimum expected total
cost of maintaining a constellation. The satellite constellations are analytically
modeled using stochastic processes, specifically discrete-time Markov chains in the
context of Markov decision processes. Optimal replacement policies are found by
using the backward induction algorithm of Markov decision processes. Sensitivity analysis
is performed to investigate the impact of model parameters on the minimum
expected total cost of maintaining a constellation.

The proposed models are created using finite-horizon Markov decision processes.
The optimal replacement policy for minimizing the many costs associated
with maintaining a satellite constellation is found by using the backward induction
algorithm. The policy obtained using this technique is optimal for minimizing the total
expected cost of maintaining the constellation, subject to the imposed assumptions.
The value of the minimum expected total cost is also provided by the backward
induction algorithm and is the same value that can be derived by solving the optimality
equations along with the boundary condition. The fact that the value from the
backward induction algorithm agrees with the results from the optimality equations shows
that the backward induction algorithm does indeed result in the optimal value of the
replacement problem.
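
The finite-horizon recursion referred to above, commonly called backward induction, can be sketched in a few lines of Python. This is a minimal illustration and not the model developed later in the thesis: the two-state example ("up"/"down"), the action names, and all costs and probabilities are made up. The recursion starts from the boundary condition at the end of the horizon and, at each earlier epoch, picks the action that minimizes the one-period cost plus the expected cost-to-go.

```python
def backward_induction(states, actions, cost, trans, horizon, terminal_cost):
    """Finite-horizon backward induction for a cost-minimizing MDP.

    actions[s]     -> allowable actions in state s
    cost[(s, a)]   -> expected one-period cost of action a in state s
    trans[(s, a)]  -> dict mapping next state -> transition probability
    terminal_cost  -> dict giving the boundary condition at the end of the horizon
    """
    value = dict(terminal_cost)
    policy = [None] * horizon
    for t in reversed(range(horizon)):
        new_value, decision = {}, {}
        for s in states:
            best_a, best_q = None, float("inf")
            for a in actions[s]:
                # One-period cost plus expected cost-to-go from the next epoch.
                q = cost[(s, a)] + sum(p * value[j] for j, p in trans[(s, a)].items())
                if q < best_q:
                    best_a, best_q = a, q
            new_value[s], decision[s] = best_q, best_a
        value, policy[t] = new_value, decision
    return value, policy

# Notional two-state example (all numbers are made up for illustration).
states = ["up", "down"]
actions = {"up": ["wait"], "down": ["wait", "replace"]}
cost = {("up", "wait"): 0.0, ("down", "wait"): 5.0, ("down", "replace"): 8.0}
trans = {
    ("up", "wait"): {"up": 0.9, "down": 0.1},
    ("down", "wait"): {"down": 1.0},
    ("down", "replace"): {"up": 1.0},
}
value, policy = backward_induction(states, actions, cost, trans, horizon=3,
                                   terminal_cost={"up": 0.0, "down": 0.0})
print(value)   # minimum expected total cost from each starting state
print(policy)  # policy[t][s] is the chosen action at epoch t in state s
```

In this toy instance the chosen action for a failed satellite depends on the epoch: replacing pays off early in the horizon, but not in the final period, when too little time remains to recover the launch cost. A finite-horizon optimal policy is in general time-dependent in this way.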

The main contribution of this research is to provide a sound foundation upon
which a more detailed analysis can be based in the future. Establishing
an analytical model for satellite replacement is a significant contribution because it
allows optimal replacement policies to be found under some mild problem assumptions.
Moreover, the analytical approach to the problem circumvents the need for
costly and time-consuming simulation studies. The resulting policies are also useful
in determining budget inputs because the expected cost per time period can be
determined. This thesis also provides a means by which sensitivity analysis may be
easily performed.

1.3 Thesis Outline

Chapter  2  provides  a  review  of  the  literature  addressing  similar  problems. 
Chapter 3 provides an overview of Markov decision processes and presents conditions
for the existence of an optimal replacement policy over a finite time horizon.
Chapter  4  presents  notional  numerical  examples  demonstrating  the  applicability  of 
these  models.  Chapter  5  reviews  the  contributions  and  limitations  of  this  work  and 
discusses  recommendations  and  future  research  directions. 
2. Literature Review


There  are  two  main  areas  of  literature  that  are  relevant  to  this  problem.  The 
first area is that of optimal replacement for degrading systems. Optimal replacement
problems have been studied a great deal in the stochastic operations research
literature.  The  second  area  is  the  mathematical  modeling  of  satellite  constellations. 
Three  specific  models  will  be  reviewed  in  detail. 

2.1  Optimal  Replacement  Models 

Survey  papers  by  McCall  [27],  Pierskalla  and  Voelker  [34],  and  Valdez-Flores 
and  Feldman  [38]  review  the  optimal  replacement  literature  from  the  early  1950s 
through the late 1980s. These papers provide a good summary of optimal replacement
literature and models through that time.

A seminal work by Barlow and Proschan [6] introduces many single-unit models.
Much of the later literature is based on this work. Barlow and Proschan [6]
describe  an  array  of  optimal  maintenance  policies,  including  age  replacement  models 
which  assume  that  spares  are  always  available.  Nakagawa  and  Osaki  [32]  extend  this 
model  to  allow  for  the  case  where  a  spare  is  not  always  available.  They  model  the 
lead  time  required  to  obtain  a  replacement  as  a  random  variable.  In  their  model, 
Nakagawa  and  Osaki  order  the  new  spare  immediately  after  each  replacement.  By 
ordering  the  spare  immediately,  the  spare  may  arrive  well  before  the  replacement 
takes  place.  Storing  the  spare  from  the  time  it  arrives  until  the  replacement  takes 
place results in a holding cost which can be quite expensive. It may therefore be
better to delay ordering the replacement. Mine and Kawai [28] address the case of
delaying  the  order  of  the  replacement  to  minimize  the  holding  cost. 

Barlow and Proschan ([6], page 18) also present another topic that is important
to the development of the model herein. They define interval reliability as the
probability that the system will continue to operate from some time to some specified
time in the future. The difference in times is the interval. Interval reliability is an
important concept to the satellite replacement models presented in Chapter 3.
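
As a small numerical illustration of this definition, the Python sketch below computes interval reliability for a single non-repairable unit. The no-repair setting, the Weibull lifetime and its parameters are assumptions made here for illustration and are not taken from [6]. Under these assumptions, the probability that the unit is operating at time t and continues to operate through t + x is simply the survival function evaluated at t + x, and dividing by the survival probability at t gives the same quantity conditioned on the unit being up at t.

```python
import math

def weibull_survival(t, shape, scale):
    """P(lifetime > t) for a two-parameter Weibull lifetime."""
    return math.exp(-((t / scale) ** shape)) if t > 0 else 1.0

def interval_reliability(t, x, survival):
    """P(unit is operating at time t and keeps operating through t + x), no repair."""
    return survival(t + x)

def conditional_interval_reliability(t, x, survival):
    """Same event, given that the unit is known to be operating at time t."""
    return survival(t + x) / survival(t)

# Exponential special case (shape = 1) with a hypothetical mean life of 10 years.
surv = lambda t: weibull_survival(t, shape=1.0, scale=10.0)
r_uncond = interval_reliability(2.0, 3.0, surv)
r_cond = conditional_interval_reliability(2.0, 3.0, surv)
print(r_uncond, r_cond)
```

For the exponential case the conditional value depends only on the interval length x and not on t, which is the memoryless property.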

Mine and Nakagawa [29] also study interval reliability, specifically when
the distribution is exponential. They use a renewal theory approach to find a
preventative maintenance policy that maximizes the interval reliability of the system
under evaluation. In this work, maintenance or a repair can be considered analogous
to a replacement in the satellite problem. The analogy holds because the authors use a
repair to correct a system failure in the same way that a replacement would correct
a failure.

Aven  and  Bergman  [3],  Dekker  [13],  and  Aven  and  Dekker  [4]  present  work  on 
a  general  structure  for  optimal  replacement  problems.  These  general  frameworks  are 
based  on  the  application  of  renewal  theory.  The  goal  in  this  approach  is  to  determine 
the  optimal  replacement  time  with  which  to  optimize  the  expected  total  cost  of  the 
system. 

Aven and Bergman [3] formally describe both a continuous-time and a discrete-time
structure. They claim that their structure can be applied to a large class of
replacement models. The authors present a general approach to minimizing the
expected total discounted cost as well as the long-run expected average cost per unit
time. This general approach involves conditions and assumptions that are independent
of specific problems. Both the continuous-time and the discrete-time frameworks
are developed by thorough definitions of the probability space and characteristics
of the applicable measure processes, such as the failure and repair/replacement
processes. A derivation of the optimal stopping time is provided.

Dekker  [13]  deals  largely  with  maintenance  activities  and  allows  penalty  cost 
functions  to  be  derived  for  deviating  from  the  optimal  maintenance  interval.  The 
author  claims  that  the  penalty  costs  can  be  used  to  set  priorities  for  action  selection. 
He  provides  penalty  functions  for  short-term,  long-term,  and  permanent  shifts  from 
the optimal policy. It is also claimed that the penalty costs can assist with production
planning. 

Aven  and  Dekker  [4]  extend  the  types  of  models  addressed  by  Dekker  [13]. 
This  paper  is  also  based  on  renewal  theory.  The  authors  state  that  the  framework 
presented  in  this  paper  is  a  simpler  version  of  the  framework  presented  in  [3].  After 
presenting  their  general  framework  and  assumptions,  the  authors  offer  examples 
of  how  to  apply  the  framework  to  several  types  of  problems.  Some  of  the  problems 
addressed  are  opportunity-based  age  replacement  problems,  opportunity-based  block 
replacement  problems,  and  minimal  repair  models. 

The  literature  discussed  thus  far  primarily  deals  with  single-unit  systems. 
These problems were largely addressed by the use of renewal theory. The body of literature
on multi-unit models, although not as developed as literature on single-unit
models, has been growing since the mid-1980s. Analytical modeling of multi-unit
systems  typically  relies  on  the  application  of  dynamic  programming.  Literature  on 
optimal  replacement  policies  of  multi-unit  non-repairable  systems  is  limited,  possibly 
due  to  the  dimensionality  of  the  state  space  for  such  problems.  The  following  articles 
address  multi-unit  optimal  replacement  models. 

Ben-Ari  and  Gal  [7]  and  Gal  [20]  present  a  multi-unit  model  for  which  an 
optimal  replacement  policy  is  found.  The  model  is  complicated  by  the  fact  that  there 
is  an  interaction  between  the  items  in  the  system.  Gal  [20]  gives,  as  an  example  of 
this  type  of  interaction,  the  case  in  which  a  lead  time  for  a  replacement  order  is 
incorporated.  Ben-Ari  and  Gal  [7]  use  a  dynamic  programming  approach  to  find 
an  optimal  replacement  policy.  To  circumvent  the  state  space  explosion  of  such 
problems,  the  method  combines  computer  simulation  and  dynamic  programming. 
This  method  is  called  the  Parameter  Iteration  Method  to  differentiate  it  from  the 
Value  Iteration  Method  commonly  used  in  dynamic  programming. 

While  Ben-Ari  and  Gal  [7]  present  an  application  of  the  Parameter  Iteration 
Method, the focus of Gal [20] is the method itself. The Parameter Iteration Method
was first introduced by Gal [19]. The method is applied when, at each
time  period  t,  the  optimal  value  (see  Section  3.1  for  a  review  of  Markov  decision 
processes/stochastic  dynamic  programming)  is  approximated  by  a  function,  from  a 
user-determined  set  of  admissible  functions,  that  depends  on  parameters  of  the  state 
variables  [20]. 

This  optimal  return  function  is  evaluated,  for  each  time  period,  by  performing 
dynamic  programming  recursions  at  enough  states  to  determine  the  function.  Gal 
[20]  claims  that,  for  Markov  decision  problems  with  a  small  amount  of  uncertainty, 
the  parameter  iteration  fits  well  with  the  use  of  simulation.  Furthermore,  he  states 
that  a  policy  considered  to  be  “reasonable”  is  used  to  simulate  the  sequence  of 
states  followed  by  some  number  of  realizations  of  the  policy.  The  return  function  is 
then  approximated  by  one  of  the  admissible  functions  for  the  states  visited  by  these 
realizations.  The  approximations  are  accomplished  by  beginning  in  the  final  time 
period  being  evaluated  and  working  backwards  toward  the  initial  time  period.  Each 
iteration of this process results in a new policy that is an improvement over the
previous  policy.  New  realizations  are  then  simulated  using  the  new  policy  and  the 
process  is  repeated  until  the  return  function  converges. 

Gal [19] points out that the Parameter Iteration Method is not automatic and requires
that the user have a good understanding of the system being evaluated in
order  to  determine  the  class  of  admissible  functions  for  the  return  function.  Ben-Ari 
and  Gal  [7]  refer  to  this  approximation  to  the  optimal  return  policy  as  a  practical 
solution  to  the  problem. 

Flynn et al. [17] present an optimal replacement model for a multi-component
reliability system. The goal of their model is to find the optimal balance between
the cost of component replacement and the cost of system failure. At the beginning
of each time period, a decision is made whether to replace any failed components.
Replacement components are assumed to always be available. The problem is formulated
as a stochastic dynamic program (Markov decision process). To address
the problem of state space explosion, the authors restrict their attention to critical
component policies which allow the replacement of a system component only if the
component has failed and is considered to be critical for the operation of the system.
Their model assumes that the components are either operational or failed and evaluates
the system over an infinite time horizon. The model presented by these authors
is a multi-unit model that does not mandate the replacement of failed components.
In this thesis, a constellation is analogous to the system and a satellite is analogous
to a component. The models in Chapter 3, however, do not assume that the system
(constellation) itself fails when a component (satellite) has failed; instead, a
penalty cost is charged whenever components of the system have failed and
performance is thereby degraded.

Chung and Flynn [10] extend their earlier study to find optimal replacement
policies for k-out-of-n systems. A k-out-of-n system is one which consists of n components
and requires at least k of the components to be operational for the system
to function. This paper uses the same assumptions as Flynn et al. [17], except that the
problem is extended to find the optimal replacement policy when k out of n independent
components must function for the system to be operational. This optimal
replacement policy is found using a dynamic programming formulation.

Chung and Flynn [11] improve on that work by presenting a more efficient
branch-and-bound algorithm that finds optimal replacement policies for k-out-of-n
systems. Flynn and Chung [18] continue their work in this area by developing a
branch-and-bound technique for consecutive k-out-of-n systems. For a consecutive
k-out-of-n system, at least k consecutive components must be operational (the operational
components are not separated by any failed components) for the system to
function.
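
The two system structures just described can be made concrete with a short Python sketch. It is illustrative only and is not the algorithm of [10], [11] or [18]. The first function assumes independent components that each work with the same probability p, an assumption introduced here for simplicity, and returns the probability that at least k of n are operational. The second checks the consecutive condition exactly as stated above, for one given pattern of working and failed components.

```python
from math import comb

def k_out_of_n_reliability(k, n, p):
    """P(at least k of n independent components work), each working with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def consecutive_k_out_of_n_works(component_up, k):
    """True if some run of at least k adjacent components is operational."""
    run = 0
    for up in component_up:
        run = run + 1 if up else 0  # a failed component resets the run
        if run >= k:
            return True
    return False

print(k_out_of_n_reliability(2, 3, 0.9))
print(consecutive_k_out_of_n_works([True, False, True, True], 2))
print(consecutive_k_out_of_n_works([True, False, True, False], 2))
```

The last two calls show the difference between the structures: both patterns contain at least two working components, but only the first contains two adjacent ones.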
2.2 Satellite Constellation Models


Jacobs  et  al.  [23]  explain  the  software  program  Operational  Constellation 
Availability  and  Reliability  Simulation  (OSCARS).  OSCARS  is  used  by  Air  Force 
Space Command to analyze and compare satellite constellation replenishment strategies
(also known as policies). OSCARS uses Monte-Carlo simulation to estimate
when  satellites  need  to  be  launched  to  maintain  a  specified  number  of  operational 
satellites. 

OSCARS  has  two  main  functions:  Generate  Launch  Schedule  and  Evaluate 
Launch Schedule. The Generate Launch Schedule function generates a launch schedule
based on data from several databases containing information about existing satellites,
planned launches, and the inventory levels of both replacement satellites and
the  boosters  needed  to  launch  them.  This  function  identifies  how  many  satellites 
need  to  be  launched  and  when  they  should  be  launched  to  maintain  a  constellation 
with  the  specified  number  of  operational  satellites.  The  Evaluate  Launch  Schedule 
function  evaluates  a  generated  launch  schedule  to  determine  how  many  operational 
satellites  are  available,  at  any  specified  time,  when  following  the  schedule  produced 
by  the  Generate  Launch  Schedule  function.  The  number  of  operational  satellites 
maintained  by  following  the  generated  schedule  is  compared  to  the  required  number 
of operational satellites to determine the performance of the schedule being evaluated.

The  events  in  OSCARS  are  driven  by  satellite  failures.  OSCARS  uses  satellite 
failure  distributions  specified  by  the  user  to  estimate  when  a  satellite  failure  will 
occur.  The  satellite  failure  distribution  can  be  represented  by  a  single  probability 
distribution  or  it  can  be  modeled  by  allowing  separate  probability  distributions  to 
represent  the  phases  (Infant  Mortality,  Useful  Life,  and  Wearout)  of  the  satellite’s 
lifetime.  The  Infant  Mortality  phase  is  modeled  by  a  user  prescribed  probability  of 
infant mortality. According to Jacobs et al. [23], the infant mortality “represents the
percent of time that the satellite will fail between its launch date and the end of the
first month of operation.”

The  Useful  Life  phase  of  the  failure  distribution  is  modeled  by  the  Weibull 
distribution.  The  age  of  an  existing  satellite  at  the  beginning  of  the  simulation 
is  taken  into  account  by  OSCARS.  This  age  determines  where  the  satellite  is  in 
relation  to  the  failure  curves.  Jacobs  et  al.  [23]  state  that  “The  Weibull  distribution 
has  historically  been  selected  to  model  satellite  failures.” 
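
One way a simulation could account for a satellite's age under a Weibull lifetime is to sample its remaining life conditional on survival to that age. The Python sketch below does this by inverse transform. It is an illustration under stated assumptions, not a description of how OSCARS is implemented: the two-parameter Weibull form and the shape and scale values are hypothetical.

```python
import math
import random

def sample_remaining_life(age, shape, scale, rng):
    """Draw the remaining life of a unit that has survived to `age` (inverse transform)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is always defined
    # Solve S(age + r) / S(age) = u for r, where S(t) = exp(-(t / scale) ** shape).
    total_life = scale * ((age / scale) ** shape - math.log(u)) ** (1.0 / shape)
    return max(0.0, total_life - age)  # guard against tiny negative rounding error

rng = random.Random(1)
# Hypothetical parameters: shape > 1 means the failure rate increases with age.
new_sat = [sample_remaining_life(0.0, 2.0, 10.0, rng) for _ in range(20000)]
old_sat = [sample_remaining_life(8.0, 2.0, 10.0, rng) for _ in range(20000)]
mean_new = sum(new_sat) / len(new_sat)
mean_old = sum(old_sat) / len(old_sat)
print(f"mean remaining life: new {mean_new:.2f}, aged 8 {mean_old:.2f}")
```

With a shape parameter above one, an older satellite has a shorter expected remaining life than a new one, which is the sense in which age determines where the satellite sits on the failure curve.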

The  Wearout  phase  is  modeled  in  four  different  ways.  The  first  way  is  described 
as  a  fixed  cutoff  or  cliff  where  the  satellite  failure  occurs  either  before  or  at  a  specified 
date. The other ways the Wearout phase is implemented include the use of the
Rayleigh  distribution,  the  normal  distribution,  and  the  normal  distribution  with  a 
fixed  cutoff  date,  which  is  a  combination  of  the  normal  distribution  and  the  fixed 
cutoff  methods. 

Outputs from OSCARS are divided by the function that produced them. The
main output of the Generate Launch Schedule function is the Launch Need Date. A
Launch Need Date is a date such that, in a specified percentage of the simulation
replications, a launch was required by that date. For example, if in 10 percent of the
replications,  a  launch  was  required  by  some  date,  that  date  would  be  a  10  percent 
Launch  Need  Date.  The  Generate  Launch  Schedule  function  also  produces  statistics 
on  satellite  and  booster  inventory  demands.  The  purpose  of  the  Evaluate  Launch 
Schedule  function  is  to  determine  how  many  operational  satellites  will  be  available 
during  the  period  of  time  covered  by  the  simulation.  The  main  outputs  of  this 
function  are  a  graph  of  the  median  number  of  operational  satellites  available  during 
the  simulation  and  a  graph  of  the  probability  of  having  at  least  the  required  number 
of  satellites  at  any  given  time  during  the  simulation. 
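
The Launch Need Date described above is in effect an empirical percentile of the dates at which a launch first became necessary across replications. The Python sketch below shows one way to compute it. The exact rule used by OSCARS is not given in the source, so the smallest-date-covering-the-percentage rule used here is an assumption, and the sample dates (in months from the start of the simulation) are made up.

```python
import math

def launch_need_date(required_by, percent):
    """Smallest date by which at least `percent` of replications required a launch."""
    ordered = sorted(required_by)
    n = len(ordered)
    rank = max(1, math.ceil(percent * n / 100))  # number of replications to cover
    return ordered[rank - 1]

# Hypothetical output of ten replications: month in which a launch was first required.
required_by = [30, 12, 45, 18, 27, 60, 24, 36, 51, 40]
lnd_10 = launch_need_date(required_by, 10)
lnd_50 = launch_need_date(required_by, 50)
lnd_90 = launch_need_date(required_by, 90)
print(lnd_10, lnd_50, lnd_90)
```

A low-percentage Launch Need Date is an early date that few replications needed, while a high-percentage one is late enough that most replications required a launch by then.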

Hansen  [21]  makes  reliability  predictions  for  satellite  constellations  by  focusing 
on  satellite  subsystems.  Hansen  states  that  reliability  for  electronic  components  is 
normally  defined  as  a  probability  of  “success”,  or  the  probability  that  the  system 
will perform its intended function for some given period of time. He then raises
the  point  that  the  definition  of  a  success  needs  to  be  clarified  when  talking  about 
satellite  constellation  reliability.  Satellites  are  built  with  many  redundancies  because 
most  satellites  cannot  be  repaired  once  they  are  launched.  This  makes  the  definition 
of  success  more  complicated.  Should  a  success  be  when  all  components  are  working 
or  a  certain  function  is  being  accomplished?  In  this  thesis,  reliabilities  consider  the 
functionality  of  a  system.  It  is  left  to  the  reader  to  determine  what  method  is  most 
appropriate  for  their  purposes. 

Hansen [21] also discusses the assumption, adopted from MIL-HDBK-217, that the
lifetimes of all satellite subsystem components are exponentially distributed.
Hansen [21] claims this assumption does not accurately represent the
actual reliability of the components. Instead he offers a five-parameter distribution
that is a linear combination of an exponential distribution used to model infant
mortality of a component and a three-parameter Weibull distribution that is used
to model the remaining lifetime of the component. Hansen [21] claims this five-parameter
distribution is a realistic alternative to the exponential component lifetimes
assumed above.
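
To make the shape of such a distribution concrete, the Python sketch below evaluates the survival function of a two-component mixture with five parameters. The parameterization is an assumption introduced here, since the source does not give Hansen's exact form: a mixing weight, an exponential rate for infant mortality, and the shape, scale and location of a three-parameter Weibull. All numerical values are notional.

```python
import math

def mixture_survival(t, weight, rate, shape, scale, location):
    """Survival function of a two-component mixture (assumed parameterization).

    weight   -> share of units subject to the exponential (infant mortality) component
    rate     -> exponential failure rate
    shape, scale, location -> three-parameter Weibull for the remaining lifetime
    """
    exp_part = math.exp(-rate * t)
    if t <= location:
        weibull_part = 1.0  # no Weibull failures before the location parameter
    else:
        weibull_part = math.exp(-(((t - location) / scale) ** shape))
    return weight * exp_part + (1.0 - weight) * weibull_part

params = dict(weight=0.1, rate=2.0, shape=2.0, scale=10.0, location=1.0)  # notional
curve = [mixture_survival(t, **params) for t in range(0, 31)]
print([round(s, 3) for s in curve[:6]])
```

With these values, early losses come almost entirely from the exponential term, and the Weibull term governs the decline once t exceeds the location parameter.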

Hansen [21] also provides an example of the redundancy measures for a subsystem
of a satellite. He claims that analytically determining the reliability of subsystem
functions is extremely complicated, if not impossible, and uses Monte-Carlo
simulation to carry out the analysis of subsystem reliabilities.

Ereau  and  Saleman  [15]  study  the  availability  of  satellite  constellations  by 
modeling  the  constellations  using  stochastic  Petri  nets.  The  authors  claim  that 
availability  analysis  during  the  development  phase  of  a  satellite  constellation  provides 
important  information  that  can  be  used  for  system  definition,  such  as  determining 
optimal  placement  of  the  satellites  and  maintenance  strategies.  They  also  claim  that 
availability  analysis  helps  minimize  global  costs.  The  authors  state  that  the  use  of 
Petri nets is better able to handle the combinatoric explosion of the number of states
than  other  types  of  models. 

The  authors  state  that  classic  methods,  such  as  Reliability  Block  Diagrams  and 
Fault  Trees,  are  good  at  representing  dependency  links  between  system  components, 
but  are  poor  for  modeling  complex  processes,  such  as  resource  sharing.  They  go  on  to 
say that Markov chains can be used to model any type of finite-state process by
completely enumerating the system states. They claim that, for satellite constellations,
the state space grows quickly and this method becomes intractable. This problem is
handled in Chapter 3 by making reasonable assumptions to limit the state space of
the systems being studied. The speed and memory capabilities of modern computers
help minimize this concern. Of course, it is always possible to make a model that is
too large for current computer capacities. For most reasonably sized constellations
(constellations  with  as  many  satellites  as  Iridium  or  the  proposed  Teledesic  system 
are  most  likely  beyond  the  range  of  reasonably  sized),  the  state  space  of  the  problem 
can  be  held  in  check  by  these  assumptions. 

The  model  of  Ereau  and  Saleman  [15]  is  based  on  a  Low  Earth  Orbit  (LEO) 
constellation  with  p  orbital  planes  that  have  k  out  of  n  satellites  each.  The  system 
has  both  a  space  segment  and  ground  logistic  support  segment.  The  ground  segment 
has c independent production lines that each produce k satellites. The ground segment
also has capacity to store s sets of one launcher and k satellites. There are l
independent  launching  areas  used  to  launch  the  satellites.  The  space  segment  allows 
both  nominal  (active)  and  standby  satellites  to  be  in  orbit. 

The  model  is  considered  to  begin  with  no  satellites  in  orbit  and  undergoes 
an  initialization  phase  to  get  the  satellites  into  orbit.  Each  launcher  is  assumed 
to launch k satellites, so at least p launches are needed to populate all of the orbital
planes. In the event of a launch failure, a new launcher and set of k satellites must be
ordered, thus delaying the completion of the initialization phase. During this phase
no standby satellites are launched into orbit.

To simplify the modeling process, Ereau and Saleman [15] use the same model
for initializing the system as well as maintaining the constellation. This results in k
satellites being launched in every replacement launch. When the first satellite fails,
k satellites are launched into that orbital plane. One of the satellites will serve as a
replacement while the other k - 1 satellites are put into standby mode. As satellites
in  that  orbital  plane  continue  to  fail,  the  standby  satellites  are  activated  to  replace 
the  failed  satellites.  When  no  more  standby  satellites  remain  in  that  plane,  the  next 
satellite  failure  of  the  plane  will  result  in  k  more  satellites  being  launched  into  that 
plane.  While  satellites  are  on  orbit  in  standby  mode  they  are  subject  to  a  satellite 
failure  rate  that  is  lower  than  the  failure  rate  of  the  active  satellites. 

The  model  is  implemented  by  using  a  global  Petri  net  made  up  of  smaller  Petri 
nets  for  the  different  model  segments.  For  example,  there  is  a  separate  network  for 
each  orbital  plane.  These  individual  networks  model  each  state  and  transition  that 
can  occur  to  the  satellites.  The  ground  segment  is  also  made  up  of  its  own  network. 
Another  network  takes  care  of  interfacing  the  ground  and  space  segment  networks. 

Ereau and Saleman [15] explain that, to be used for quantitative analysis, Petri
nets must be extended to incorporate the use of time. This extension of Petri nets is
called Stochastic Timed Petri Nets. Analytical results with this type of network are
possible, but the issue of state space explosion still exists. The authors state that
if analytical methods were used, the p orbit plane models would have more than
160,000 states for an example that has three orbital planes each with two active
satellites and a slot for a standby satellite. By comparison, the model presented in
Section 3.4 would have 4,608 states for a constellation with nine satellites.

Thus, in order to obtain quantitative results for their model, the authors resort
to Monte-Carlo simulation. The authors claim that their symbolic modeling of the
satellite constellation with the Petri nets allows broad sensitivity analysis for the
input parameters without having to change the model. Ereau and Saleman [15]
conclude by providing examples of outputs from their model. The Central Limit
Theorem is applied to determine confidence intervals for mission availability over
the lifetime of the mission. They also present a figure showing the probability that
a given number of satellites are ordered during the evaluated time period.

The  satellite  replacement  problem  presented  in  this  thesis  is  a  multi-unit  system 
with  stochastically  deteriorating  components.  Under  normal  conditions,  satellites  are 
not repairable and must be replaced. A large portion of the optimal replacement
literature deals with determining maintenance times and, eventually, replacement of
repairable systems in order to minimize cost or maximize availability. Much of this
literature cannot be directly applied to finding replacement policies for non-repairable
systems. However, several models from the optimal replacement literature were
reviewed that could be applied to satellites. Much of this literature concerned the use
of  renewal  theory  and  was  more  closely  aligned  with  the  modeling  of  a  single  satellite 
system.  Literature  regarding  the  optimal  replacement  policy  for  multi-unit  systems 
typically  used  dynamic  programming  to  evaluate  the  systems. 

The  work  by  Gal  [19],  Ben-Ari  and  Gal  [7],  and  Gal  [20]  addressed  multi-unit 
systems,  but  found  only  an  approximation  to  the  optimal  policy.  The  goal  of  this 
research  is  to  analytically  model  the  satellite  constellations  and  to  find  a  provably 
optimal  replacement  policy  to  minimize  the  expected  total  cost  of  maintaining  the 
constellation.  Flynn  et  al.  [17],  Chung  and  Flynn  ([10],  [11])  and  Flynn  and  Chung 
[18]  simplify  the  system  by  only  considering  the  critical  components.  They  assume 
that  if  any  of  these  critical  components  fail,  the  system  also  fails.  This  differs  from 
the  problem  addressed  here  in  that,  when  an  active  satellite  fails,  the  constellation 
is  degraded,  but  still  capable  of  providing  some  usefulness  so  long  as  at  least  one 
satellite  remains  operational. 

Jacobs et al. [23] and Ereau and Saleman [15] model satellite
constellations. Hansen [21] addresses modeling satellite subsystems, which has some
similarities to the modeling of satellite constellations. All three of these articles use
Monte-Carlo simulation to analyze their models. A provably optimal replacement
policy cannot be found by this method.

This thesis uses Markov decision processes (which are solved as stochastic
dynamic programming problems) to provide a general analytical model to find optimal
replacement policies for satellite constellations which minimize the expected total
cost of maintaining the constellation. Markov decision processes are used to solve
the models because the provably optimal replacement policy can be found in this
way. Moreover, the resulting policy vector can be easily implemented by a
decision maker. A review of Markov decision processes is provided at the beginning of
Chapter 3.

3. Formal Model Description

The  problem,  as  discussed  in  Chapter  1,  is  to  find  the  optimal  replacement 
policy  which  minimizes  the  expected  total  cost  (monetary  and  opportunity  costs) 
of  maintaining  a  satellite  constellation.  A  policy  meeting  these  criteria  balances 
the  need  to  have  a  fully  operational  satellite  constellation,  capable  of  fulfilling  its 
intended mission, with the need to limit the funding required to maintain the
constellation. The optimal replacement policy can be found analytically by applying
finite-horizon  Markov  decision  processes.  A  review  of  Markov  decision  processes  is 
presented  next  to  provide  a  framework  for  the  satellite  replacement  problem. 

3.1  A  Review  of  Markov  Decision  Processes 

Mine and Osaki [30] define a Markov decision process as “a sequential decision
process on a discrete-time Markov chain” where a discrete-time Markov chain
(DTMC) is a stochastic process with specific properties. Kulkarni [26], page 16,
defines a stochastic process \(\{X_n, n \ge 0\}\) to be a DTMC with state space \(S\), where
\(X_n\) is the state of the system at time \(n\), if for all \(n \ge 0\), \(X_n \in S\) and the Markov,
or memoryless, property holds (i.e., the history of the process is contained in the
current state of the system, so that only the current state of the system needs to
be considered). A DTMC transitions from state to state at discrete time points. A
sample  path  showing  how  a  DTMC  might  transition  is  shown  in  Figure  3.1.  The 
state  to  which  the  process  transitions  is  determined  by  the  transition  probabilities. 
The  state  transitions  are  determined  by  the  system  being  modeled  (state  transitions 
which  are  not  possible  for  a  given  system  have  transition  probabilities  equal  to  zero). 
The  possible  states  and  the  transition  probabilities  determine  how  the  system  evolves 
over  time  and  are  also  dependent  on  the  system  being  modeled. 
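
A sample path of the kind described above can be generated with a few lines of code. The following is a minimal sketch, assuming a hypothetical three-state chain; the transition matrix `P`, the function name `simulate_dtmc`, and the seed are illustrative choices and do not come from the models in this thesis.

```python
import random

def simulate_dtmc(P, start, steps, rng):
    """Return a sample path X_0, ..., X_steps of a DTMC with transition matrix P."""
    path = [start]
    for _ in range(steps):
        current = path[-1]
        # The next state is drawn using only the current state (Markov property).
        nxt = rng.choices(range(len(P)), weights=P[current], k=1)[0]
        path.append(nxt)
    return path

# Hypothetical three-state chain; zero entries mark impossible transitions.
P = [
    [0.7, 0.3, 0.0],
    [0.0, 0.6, 0.4],
    [0.0, 0.0, 1.0],
]
path = simulate_dtmc(P, start=0, steps=10, rng=random.Random(0))
```

Because the zero entries rule out backward moves in this made-up chain, every simulated path is non-decreasing and eventually stays in the last state.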

Puterman [35] states that Markov decision processes are “also referred to as
stochastic dynamic programs or stochastic control problems.” White [39], page 1,
adds that the essential objective of Markov decision processes is to determine which
of the possible actions is optimal for each state. This review of Markov decision
processes is divided into three parts. The first part discusses the formulation of a Markov
decision process model. The second part addresses optimality criteria for Markov
decision processes. The final part covers the backward induction algorithm and
linear programming formulations of finite-horizon Markov decision processes. This
review follows, in large part, from the excellent treatment by Puterman [35].

Figure 3.1 Possible DTMC sample path.

3.1.1  Formulation  of  a  Markov  Decision  Process 

Puterman [35], pages 17-22, clearly lays out the components of a Markov decision
process model whose formulation includes defining the decision epochs and
periods,  the  states  and  action  sets,  the  rewards  and  transition  probabilities,  the 
decision  rules,  and  the  policies. 

A finite-horizon model where decisions are made at discrete time points is
used to solve the problem. The discrete time points at which decisions are made
are referred to as decision epochs. Time is divided into periods with decision epochs
representing the beginning of each period. The set of decision epochs is denoted
\(T = \{1, 2, \ldots, N\}\), where \(N < \infty\) indicates that the time horizon is finite. At
each decision epoch, the model will be in one of a number of possible states and a
decision must be made concerning the course of action to follow.

States represent all of the possible scenarios in which the system can be
observed. The set of all possible states is known as the state space. Labelling the
states by the integers \(s = 1, 2, \ldots, K\), the state space is denoted as \(S = \{s_1, s_2, \ldots, s_K\}\).
At each decision epoch in which the system is found to be in state \(s\), a decision maker
must choose an action \(a\), where actions are labelled by the integers \(a = 1, 2, \ldots, L\),
from the set of actions that are available while the system is in state \(s\). The number
of actions \(L\) available in any state \(s\) is dependent on that state and need not be the
same for all \(s\). The set of actions available in state \(s\) is denoted as
\(A_s = \{a_{s,1}, a_{s,2}, \ldots, a_{s,L}\}\), where \(a_{s,m}\) is the \(m\)th possible action while in
state \(s\). For each action chosen by the decision maker, there is a corresponding
reward (or cost) for making that decision.

In general, the reward a decision maker achieves for choosing action \(a \in A_s\)
in state \(s\) at decision epoch \(t\) is denoted as \(r_t(s, a)\). In the case where the rewards
remain the same throughout all time periods, the notation can be shortened to
\(r(s, a)\). Rewards are real-valued and may be positive or negative. They may be
considered as income when positive and as cost when negative. In finite-horizon
Markov decision process models, no decision is made at the final decision epoch \(N\)
because a decision made at decision epoch \(N\) would not be implemented, as decision
epoch \(N\) marks the end of the time horizon being evaluated. The reward at the
final decision epoch is a function of only the state and is denoted \(r_N(s_N)\). When
analyzing a finite-horizon model, the final decision epoch \(N\) works to summarize the
results of the previous decision. The final reward, \(r_N(s)\), is sometimes referred to as
the salvage value because this is the value of the final state of the system at the end
of the time frame being evaluated. Prior to the final decision epoch, the state of the
system at the next decision epoch is based on the current state of the system, the
decision made by the decision maker, and the transition probabilities.

A transition probability is the probability that the system moves from state \(s\)
to another specified state at the time of the next decision epoch given action \(a \in A_s\)
was chosen. Transition probabilities are denoted as \(p_t(\cdot \mid s, a)\), where \((\cdot)\) represents the
state to which the system transitions, given the system is in state \(s\) and action \(a\) is
chosen. In the case where the transition probabilities remain the same throughout
all time periods, the notation can be shortened to \(p(\cdot \mid s, a)\). For models presented
here, it is assumed that

\[ \sum_{j \in S} p_t(j \mid s, a) = 1, \tag{3.1} \]

although Puterman [35], page 20, does discuss models where this equality is not
required.

Decision  rules  define  the  procedure  used  to  select  the  actions  for  each  state 
at  each  decision  epoch.  Decision  rules  can  be  either  deterministic  or  stochastic  and 
either Markovian or non-Markovian [35], page 21. Of the different combinations of
these characteristics, history-dependent randomized policies \(\Pi^{HR}\) are the most
general [35], page 21. When decision rules are deterministic, actions are selected with
certainty. When decision rules are randomized, an action is selected randomly,
according to a specified probability distribution, from the set of available actions for
that  state.  Markovian  decision  rules  rely  only  on  the  current  state  of  the  system. 
This  follows  from  the  Markov  property  that  states  the  probabilistic  behavior  of  the 
future  of  the  process  depends  only  on  the  current  state  of  the  process  [26],  page 
16.  In  this  sense,  the  history  of  the  process  is  contained  in  the  current  state  of  the 
process.  History-dependent  decision  rules  depend  on  the  past  history  of  the  system 
as  represented  by  all  of  the  previous  states  and  actions. 

In this work, decision rules will be deterministic and Markovian. Markovian
deterministic policies \(\Pi^{MD}\) are a subset of history-dependent randomized policies
[35], page 22. Puterman [35], page 89, shows that an optimal policy can be found
among the deterministic policies, and therefore randomized decision rules will not be
considered here. Deterministic Markovian decision rules are functions \(d_t : S \to A_s\), where
\(s \in S\) and \(d_t(s) \in A_s\), which determine the action chosen whenever the system is in
state \(s\) at decision epoch \(t\).

Defining a policy is the final step in formulating a Markov decision process. A
policy \(\pi\) is a sequence of the decision rules used, \(\pi = (d_1, d_2, \ldots, d_{N-1})\). A policy that
uses the same decision rule for all decision epochs, \(d_t = d\) for all \(t\), is called a stationary
policy.

In  summary,  Markov  decision  processes  are  composed  of  decision  epochs  and 
periods,  states  and  action  sets,  rewards  and  transition  probabilities,  decision  rules, 
and  policies.  The  models  presented  herein  are  finite-horizon  models,  i.e.,  there  is  a 
finite number of discrete decision epochs. The decision epochs occur at the beginning
of each period. States represent the different conditions in which the system
can  be  observed.  For  each  state  that  the  system  can  assume,  there  is  a  set  of  actions 
from  which  a  decision  maker  can  choose.  These  actions,  along  with  the  transition 
probabilities,  determine  which  state  the  system  will  be  in  at  the  next  decision  epoch. 
There  is  a  reward  associated  with  the  selection  of  each  action.  A  reward  can  be 
interpreted  as  either  income  or  cost  and  results  from  the  selection  of  a  particular 
action.  Decision  rules  specify  how  that  action  can  be  chosen  and  policies  specify 
which action is chosen throughout the time horizon under consideration. The decision
epochs, states, action sets, rewards, and transition probabilities make up a
Markov decision process. The combination of these components and an
optimality criterion is known as a Markov decision problem.
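
The components summarized above map naturally onto a small data structure. The sketch below is one possible layout, assuming stationary rewards and transition probabilities; the class name `FiniteHorizonMDP`, its field names, and the two-state example data are hypothetical and are not the models developed later in this chapter.

```python
from dataclasses import dataclass

@dataclass
class FiniteHorizonMDP:
    """Assumed container for the components of a finite-horizon MDP.

    states   -- list of state labels
    actions  -- dict: state -> list of actions available in that state (A_s)
    reward   -- dict: (state, action) -> stationary reward r(s, a)
    trans    -- dict: (state, action) -> dict next_state -> p(j | s, a)
    terminal -- dict: state -> terminal reward r_N(s)
    horizon  -- N, the index of the final decision epoch
    """
    states: list
    actions: dict
    reward: dict
    trans: dict
    terminal: dict
    horizon: int

    def validate(self, tol=1e-9):
        """Check that transition probabilities sum to one, as in Equation (3.1)."""
        for s in self.states:
            for a in self.actions[s]:
                total = sum(self.trans[(s, a)].values())
                if abs(total - 1.0) > tol:
                    raise ValueError(f"p(.|{s},{a}) sums to {total}")
        return True

# Hypothetical example: costs are entered as negative rewards.
mdp = FiniteHorizonMDP(
    states=["up", "down"],
    actions={"up": ["wait"], "down": ["wait", "replace"]},
    reward={("up", "wait"): 0.0, ("down", "wait"): -5.0, ("down", "replace"): -3.0},
    trans={
        ("up", "wait"): {"up": 0.9, "down": 0.1},
        ("down", "wait"): {"down": 1.0},
        ("down", "replace"): {"up": 1.0},
    },
    terminal={"up": 0.0, "down": 0.0},
    horizon=4,
)
is_valid = mdp.validate()
```

Note that the action sets differ by state in this example, which mirrors the remark that the number of available actions need not be the same for all states.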

3.1.2  Optimality  Criteria 

In order to determine an optimal policy for a Markov decision process, there
must be a means by which to compare policies to determine an ordering. Markov
decision process models are stochastic models, therefore the rewards generated by a
policy (e.g. \(\pi = (d_1, d_2, \ldots, d_{N-1})\)) form a vector of random variables. Because
policies are compared through these random vectors, a stochastic ordering of the
vectors is required.

The following notation will aid in the discussion of stochastic ordering and
policy comparison. Let \(R_t\) denote the random reward received in time period \(t\) when
\(t < N\) and let \(R_N\) denote the reward of the final decision epoch or the salvage
value [35], page 74. Normally, for \(t < N\) the rewards are independent of the time
period and the subscript \(t\) is dropped for stationary rewards. \(\mathbb{R}\) denotes the set of all
real numbers, and \(\mathbb{R}^n\) denotes all \(n\)-dimensional vectors of real values. The vector
\(R = (R_1, \ldots, R_N) \in \mathbb{R}^N\) denotes a random sequence of rewards. Finally, \(\mathcal{R}\) denotes
the set of all possible reward sequences.

For a pair of random variables it is said that the random variable \(U\) is
stochastically greater than the random variable \(V\) if

\[ P\{V > t\} \le P\{U > t\}, \quad \forall\, t \in \mathbb{R}. \tag{3.2} \]

A random vector \(U = (U_1, \ldots, U_N)\) is stochastically greater than a random vector
\(V = (V_1, \ldots, V_N)\) if

\[ E[f(V_1, \ldots, V_N)] \le E[f(U_1, \ldots, U_N)], \quad \forall\, f : \mathbb{R}^N \to \mathbb{R} \tag{3.3} \]

where the expectation is finite and the partial ordering on \(\mathbb{R}^N\) is maintained, such
that \(f(v_1, \ldots, v_N) \le f(u_1, \ldots, u_N)\) whenever \(v_i \le u_i\) for \(i = 1, \ldots, N\) [35], page 75.

According to Puterman [35], page 77, when using stochastic ordering to compare
two policies \(\pi\) and \(\nu\) the inequality

\[ E^{\pi}[f(R_1, \ldots, R_N)] \ge E^{\nu}[f(R_1, \ldots, R_N)] \tag{3.4} \]

must hold for a large class of functions \(f\) which may not be representative of the
decision maker’s tolerance for risk. The issue of risk is raised because a decision
chosen, based solely on maximizing the expected value, may seem more risky to
a decision maker when compared to another decision with a lower expected value.
For  example,  Clemens  and  Reilly  [12],  page  529,  present  a  game  similar  to  the 
following.  Suppose  there  are  two  possible  decisions,  decision  A  and  decision  B. 
Each  decision  has  two  possible  outcomes  with  each  outcome  having  a  probability  of 
0.5.  The  possible  outcomes  of  decision  A  are  gains  of  $1  and  $100,  which  results  in 
an  expected  gain  of  $50.50.  The  possible  outcomes  of  decision  B  are  gains  of  $40 
and $60, which results in an expected gain of $50. If maximizing the expected value
were the only concern, decision A should be chosen. However, some people would
think  decision  A  is  riskier  than  decision  B  because  choosing  decision  B  guarantees  a 
gain  of  at  least  $40  with  a  chance  for  more,  but  decision  A  only  guarantees  a  gain  of 
$1.  Therefore,  it  may  be  desirable  to  consider  risk  along  with  expected  value  when 
evaluating  decisions.  According  to  Bertsekas  [8],  pages  4  and  8,  expected  utility  theory 
can  be  used  to  apply  mathematical  methods  for  analyzing  decision  problems  when 
the  decision  maker  is  able  to  rank  order,  by  preference,  the  probability  distribution 
of  each  possible  outcome. 
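
The two-decision example can be checked numerically. The sketch below computes the expected gains, tests the stochastic ordering of Equation (3.2) in both directions, and then compares the decisions under a concave utility. The square-root utility and all function names are assumptions chosen for illustration; they are not part of the cited example.

```python
import math

# Outcomes and probabilities of the two decisions in the example above.
decision_a = {1.0: 0.5, 100.0: 0.5}
decision_b = {40.0: 0.5, 60.0: 0.5}

def expected_value(dist, utility=lambda x: x):
    """Expected utility of a discrete distribution {outcome: probability}."""
    return sum(p * utility(x) for x, p in dist.items())

def survival(dist, t):
    """P{X > t} for a discrete distribution."""
    return sum(p for x, p in dist.items() if x > t)

def stochastically_greater(u, v):
    """True if P{V > t} <= P{U > t} for every t, as in Equation (3.2)."""
    # Survival functions are step functions, so checking the support points
    # of both distributions covers every value of t.
    points = sorted(set(u) | set(v))
    return all(survival(v, t) <= survival(u, t) for t in points)

ev_a = expected_value(decision_a)   # expected gain of decision A
ev_b = expected_value(decision_b)   # expected gain of decision B
a_dominates = stochastically_greater(decision_a, decision_b)
b_dominates = stochastically_greater(decision_b, decision_a)

# A concave (risk-averse) utility, assumed here purely for illustration.
eu_a = expected_value(decision_a, math.sqrt)
eu_b = expected_value(decision_b, math.sqrt)
```

Neither decision is stochastically greater than the other, so stochastic ordering alone cannot rank them. Expected value favors decision A, while the assumed risk-averse utility favors decision B.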

Puterman [35], page 77, states that requiring Equation (3.4) to only hold for
a specified function allows utility theory to provide a useful means of policy
comparison. A utility function, \(\Psi(\cdot)\), is a real-valued function representing a decision
maker’s preference for elements in a set \(W\). If the decision maker prefers \(v\) over
\(w\) this implies \(\Psi(v) > \Psi(w)\). Also, if \(\Psi(v) > \Psi(w)\), this implies that the decision
maker prefers \(v\) over \(w\). Thus, this is an if-and-only-if relationship. If the decision
maker has no preference between \(v\) and \(w\) then \(\Psi(v) = \Psi(w)\). Using utility \(\Psi(\cdot)\) all
of the elements in set \(W\) can be compared. Describing techniques for determining
utility  functions  is  beyond  the  scope  of  this  work.  The  interested  reader  is  referred 
to Fishburn [16] and Keeney and Raiffa [25].

If the elements of \(W\) are allowed to represent the outcomes of a random process
then, according to Puterman [35], page 77, the “expected utility provides a total
ordering on equivalence classes of outcomes.” The expected utility for a discrete
random variable \(Y\) is given by

\[ E[\Psi(Y)] = \sum_{y \in W} \Psi(y) P\{Y = y\}. \tag{3.5} \]

Let \((\rho_1, \ldots, \rho_N)\) represent a realization of the reward process and let \(P_R^{\pi}\) denote
the probability distribution on the set of rewards. Then for finite-horizon Markov
decision process models with discrete state spaces the expected utility of a policy \(\pi\)
can be represented as

\[ E^{\pi}[\Psi(R)] = \sum_{(\rho_1, \ldots, \rho_N) \in \mathcal{R}} \Psi(\rho_1, \ldots, \rho_N) P_R^{\pi}(\rho_1, \ldots, \rho_N). \tag{3.6} \]


When using expected utility it is clear the decision maker prefers policy \(\pi\) to policy
\(\nu\) if

\[ E^{\pi}[\Psi(R_1, \ldots, R_N)] > E^{\nu}[\Psi(R_1, \ldots, R_N)]. \tag{3.7} \]

The models presented herein assume linear additive utility which is given by

\[ \Psi(\rho_1, \ldots, \rho_N) = \sum_{i=1}^{N} \rho_i. \tag{3.8} \]

Linear  additive  utility  is  used  because  as  pointed  out  by  Puterman  [35],  pages  77  and 
78,  such  utilities  represent  the  preferences  of  a  decision  maker  that  is  risk  neutral 
and  indifferent  to  the  timing  with  which  the  rewards  are  received. 

Let \(v_N^{\pi}(s)\) be the expected total reward over the decision-making horizon when
policy \(\pi\) is used and the system begins in state \(s\). Then the expected total reward
for a deterministic Markov policy is given by

\[ v_N^{\pi}(s) = E_s^{\pi} \left\{ \sum_{t=1}^{N-1} r_t(s_t, d_t(s_t)) + r_N(s_N) \right\}. \tag{3.9} \]

Now that the expected total reward has been defined, the optimal policy, based
on that expected total reward, is established. For a model where reward is
maximized, the policy with the largest expected total reward is desired and is denoted as
\(\pi^*\). This optimal policy \(\pi^*\) is found when (cf. [35], page 79)

\[ v_N^{\pi^*}(s) \ge v_N^{\pi}(s), \quad s \in S \]

for all policies \(\pi \in \Pi^{HR}\).

The value of a Markov decision problem, \(v_N^*\), for the maximization case, where
sup represents the supremum and max represents the maximum, is given by

\[ v_N^*(s) = \sup_{\pi \in \Pi^{HR}} v_N^{\pi}(s), \quad s \in S, \tag{3.10} \]

or

\[ v_N^*(s) = \max_{\pi \in \Pi^{HR}} v_N^{\pi}(s), \quad s \in S, \tag{3.11} \]

when the value of the supremum is attained in Equation (3.10), such as when each
\(A_{s_t}\) is finite [35], page 79. The logic remains the same for the minimization case with
infimum replacing supremum and minimum replacing maximum.

The expected total reward of an optimal policy \(\pi^*\) is the same as the value of
the Markov decision problem and thus satisfies

\[ v_N^{\pi^*}(s) = v_N^*(s), \quad s \in S. \tag{3.12} \]

Markov decision processes are nothing more than stochastic dynamic programming
problems. Dynamic programming makes use of a fundamental recursion to
efficiently calculate a result. To perform this recursion, in the context of Markov
decision processes, the expected total reward of a fixed policy must first be defined.
Let \(u_t^{\pi}\) denote the total expected reward obtained by implementing policy \(\pi\) at the
decision epochs \(t, t+1, \ldots, N-1\) [35], page 80. For a Markovian deterministic policy,
\(u_t^{\pi}\) when \(t < N\) is given by

\begin{align}
u_t^{\pi}(s_t) &= E_{s_t}^{\pi} \left\{ \sum_{n=t}^{N-1} r_n(s_n, d_n(s_n)) + r_N(s_N) \right\} \tag{3.13} \\
&= r_t(s_t, d_t(s_t)) + \sum_{j \in S} p_t(j \mid s_t, d_t(s_t)) u_{t+1}^{\pi}(j). \tag{3.14}
\end{align}

The difference between \(u_t^{\pi}(s)\) and \(v_N^{\pi}(s)\) defined above in Equation (3.9) is that \(u_t^{\pi}(s)\)
only includes rewards from decision epoch \(t\) forward, whereas \(v_N^{\pi}(s)\) includes rewards
for the entire future [35], page 80.
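
The recursion in Equation (3.14) can be evaluated by starting from the terminal reward and stepping backward in time. The following is a minimal sketch for a fixed deterministic Markovian policy, assuming stationary rewards and transition probabilities stored in dictionaries. The function name `evaluate_policy` and the two-state example data are hypothetical.

```python
def evaluate_policy(states, reward, trans, terminal, policy, horizon):
    """Backward evaluation of u_t^pi for a deterministic Markovian policy.

    reward[(s, a)]   -- stationary reward r(s, a)
    trans[(s, a)][j] -- stationary transition probability p(j | s, a)
    terminal[s]      -- terminal reward r_N(s)
    policy[t][s]     -- action d_t(s) for t = 1, ..., horizon - 1
    Returns u with u[t][s] = u_t^pi(s) for t = 1, ..., horizon.
    """
    u = {horizon: dict(terminal)}               # boundary: u_N(s) = r_N(s)
    for t in range(horizon - 1, 0, -1):         # t = N-1, ..., 1
        u[t] = {}
        for s in states:
            a = policy[t][s]
            future = sum(p * u[t + 1][j] for j, p in trans[(s, a)].items())
            u[t][s] = reward[(s, a)] + future   # Equation (3.14)
    return u

# Hypothetical data: costs are entered as negative rewards.
states = ["up", "down"]
reward = {("up", "wait"): 0.0, ("down", "wait"): -5.0, ("down", "replace"): -3.0}
trans = {
    ("up", "wait"): {"up": 0.9, "down": 0.1},
    ("down", "wait"): {"down": 1.0},
    ("down", "replace"): {"up": 1.0},
}
terminal = {"up": 0.0, "down": 0.0}
horizon = 3
# Fixed policy: replace whenever the system is down.
policy = {t: {"up": "wait", "down": "replace"} for t in range(1, horizon)}
u = evaluate_policy(states, reward, trans, terminal, policy, horizon)
```

For this made-up policy, `u[1]` holds the expected total reward from the first decision epoch onward for each starting state, which corresponds to the quantity in Equation (3.9).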


3.1.3  Solution  Methods 

The expected total reward \(v_N^{\pi}\) can be computed by inductively evaluating \(u_t^{\pi}\)
using Puterman’s [35], page 80, Finite Horizon Policy Evaluation Algorithm.

Finite Horizon Policy Evaluation Algorithm (for fixed \(\pi \in \Pi^{MD}\))

1. Set \(t = N\) and \(u_N^{\pi}(s_N) = r_N(s_N)\) for all \(s_N \in S\).

2. If \(t = 1\), stop; otherwise go to Step 3.

3. Substitute \(t - 1\) for \(t\) and compute \(u_t^{\pi}(s_t)\) for each \(s_t \in S\) by

\[ u_t^{\pi}(s_t) = r_t(s_t, d_t(s_t)) + \sum_{j \in S} p_t(j \mid s_t, d_t(s_t)) u_{t+1}^{\pi}(j). \tag{3.15} \]

4. Return to Step 2.

Optimality equations provide a basis for determining optimal policies and for
the maximization case are given by


\[ u_t(s_t) = \sup_{a \in A_{s_t}} \left\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a) u_{t+1}(j) \right\} \tag{3.16} \]

for \(t = 1, \ldots, N-1\), and the boundary condition

\[ u_N(s_N) = r_N(s_N) \tag{3.17} \]

when \(t = N\). When the supremum in Equation (3.16) is attained, the supremum
operation can be replaced with the maximum as follows

\[ u_t(s_t) = \max_{a \in A_{s_t}} \left\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a) u_{t+1}(j) \right\}. \tag{3.18} \]

The optimality equations reduce to the policy evaluation equations, Equation
(3.14) or Equation (3.15), when the supremum over all of the actions in state \(s_t\) is
replaced by the action specified by the policy being evaluated.

According to Puterman [35], page 84, the optimality equations are fundamental
to Markov decision theory because of the following properties:

Property A: The solutions to the optimality equations provide the optimal return
values from period \(t\) to \(N\).

Property B: The optimality equations determine if a policy is optimal. If the
expected total reward of policy \(\pi\) for periods \(t\) onward satisfies the optimality
equations for \(t = 1, \ldots, N\), then the policy is optimal.

Property C: The optimality equations provide an efficient procedure for determining
optimal return functions and policies.

Property D: The optimality equations can be used to determine structural
properties of optimal return functions and policies.

The above properties are very important to the study of Markov decision
processes. A proof of the properties is beyond the scope of this thesis, although the
interested reader is directed to Puterman [35], page 84.

Puterman [35], page 92, presents backward induction as an efficient method
for solving finite-horizon discrete-time Markov decision process problems. Puterman
[35], page 92, also states that for stochastic problems the enumeration and evaluation
of all policies is the only alternative to backward induction. The Backward Induction
Algorithm presented below generalizes the policy evaluation algorithm.

Backward  Induction  Algorithm 


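1. Set \(t = N\) and \(u_N^*(s_N) = r_N(s_N)\) for all \(s_N \in S\).

2. Substitute \(t - 1\) for \(t\) and compute \(u_t^*(s_t)\) for each \(s_t \in S\) by

\[ u_t^*(s_t) = \max_{a \in A_{s_t}} \left\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a) u_{t+1}^*(j) \right\}. \tag{3.19} \]

Set

\[ A_{s_t,t}^* = \arg\max_{a \in A_{s_t}} \left\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a) u_{t+1}^*(j) \right\}, \tag{3.20} \]

where \(\arg\max_{a \in A_{s_t}}\) returns the set of actions (e.g. \(\{a_{1,1}, a_{1,3}\}\)) which attain
the maximum value of the elements evaluated.

3. If \(t = 1\), stop. Otherwise return to Step 2.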


The  Backward  Induction  Algorithm  is  employed  to  find  the  optimal  policies  for  the 
models  presented  in  this  thesis. 
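
The steps of the Backward Induction Algorithm translate almost line for line into code. The sketch below assumes stationary rewards and transition probabilities held in dictionaries; the function name `backward_induction`, the tie tolerance, and the two-state example data are hypothetical and are not the satellite models of this thesis.

```python
def backward_induction(states, actions, reward, trans, terminal, horizon):
    """Return optimal values u[t][s] and optimal action sets best[t][s]."""
    u = {horizon: dict(terminal)}                    # Step 1: u*_N(s) = r_N(s)
    best = {}
    for t in range(horizon - 1, 0, -1):              # Steps 2 and 3: t = N-1, ..., 1
        u[t], best[t] = {}, {}
        for s in states:
            # Value of each available action followed by optimal behavior.
            q = {
                a: reward[(s, a)]
                + sum(p * u[t + 1][j] for j, p in trans[(s, a)].items())
                for a in actions[s]
            }
            u[t][s] = max(q.values())                # Equation (3.19)
            # Equation (3.20): keep every action that attains the maximum.
            best[t][s] = {a for a, v in q.items() if abs(v - u[t][s]) <= 1e-12}
    return u, best

# Hypothetical data: costs are entered as negative rewards.
states = ["up", "down"]
actions = {"up": ["wait"], "down": ["wait", "replace"]}
reward = {("up", "wait"): 0.0, ("down", "wait"): -5.0, ("down", "replace"): -3.0}
trans = {
    ("up", "wait"): {"up": 0.9, "down": 0.1},
    ("down", "wait"): {"down": 1.0},
    ("down", "replace"): {"up": 1.0},
}
terminal = {"up": 0.0, "down": 0.0}
u_star, best = backward_induction(states, actions, reward, trans, terminal, horizon=3)
```

In this made-up example replacing is optimal whenever the system is down, since the replacement cost is lower than the cost of waiting in the failed state. Returning a set of actions for each state and epoch preserves ties, as Equation (3.20) requires.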


For completeness, it should be noted that it is also possible to formulate the
problem as a linear program. Derman and Klein [14], Ross [36], and White [39]
all discuss linear programming formulations for finite-horizon Markov decision
processes. The following formulations are based on the treatments of linear programming
formulations given by White [39], pages 113-114, and Ross [36], pages 40-42.

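The problem of finding the maximum expected total reward can be formulated as

$$\min_{u} \Big\{ \lambda u = \sum_{s \in S} \lambda_s u_1(s) \Big\} \tag{3.21}$$

subject to

$$u_t(s) \ge \max_{a \in A_s} \Big\{ r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a)\, u_{t+1}(j) \Big\}, \quad 1 \le t \le N - 1,\ \forall s \in S \tag{3.22}$$

$$u_N(s) = 0, \quad t = N,\ \forall s \in S \tag{3.23}$$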

where $\lambda$ is a vector whose length is the number of states in the system. The elements
of $\lambda$ represent the probability of the system beginning in state $s \in S$. The decision
variables $u_t$, $t = 1, 2, \ldots, N$, are unrestricted in sign. Recall from Equation (3.14) that
$u_t(s)$ represents the total expected reward through time periods $t, t + 1, \ldots, N - 1$.
Equations (3.22) and (3.23) correspond to the optimality equations and boundary
condition, respectively.

The formulation can be equivalently stated as

$$\min_{u} \Big\{ \lambda u = \sum_{s \in S} \lambda_s u_1(s) \Big\} \tag{3.24}$$

subject to

$$u_t(s) \ge r_t(s_t, a) + \sum_{j \in S} p_t(j \mid s_t, a)\, u_{t+1}(j), \quad 1 \le t \le N - 1,\ \forall s \in S,\ a \in A_s \tag{3.25}$$

$$u_N(s) = 0, \quad t = N,\ \forall s \in S \tag{3.26}$$


In some cases, such as the problem solved by this research, where expected total
cost is being minimized, the problem can be formulated with the objective function
written as follows:

$$\min_{u} \Big\{ \sum_{t=1}^{N} \sum_{s \in S} u_t(s) \Big\}. \tag{3.27}$$

In these cases, the value of the objective function is ignored and the solution is found by
looking at the values of the decision variables.

By  using  the  linear  programming  formulations,  the  optimal  (minimum)  costs  at 
each  decision  epoch  may  be  found.  These  values  are  the  same  as  those  found  when 
solving  the  problem  using  Markov  decision  processes.  Unfortunately,  solving  the 
problem  via  linear  programming  does  not  provide  the  policy  that  must  be  followed 
to  obtain  these  optimal  values.  For  this  reason  it  is  desirable  to  use  Markov  decision 
processes  to  solve  the  problem.  Being  able  to  find  solutions  to  the  problem  via 
linear  programming  provides  a  useful  check  on  the  results  of  the  stochastic  dynamic 
programming  solution.  In  addition,  the  linear  programming  formulation  allows  for 
sensitivity  analysis  to  be  extended  to  an  analysis  of  the  reward  (cost)  coefficients  of 
the  problem. 

3.2  Assumptions  for  the  Satellite  Constellation  Models 

Various  assumptions  regarding  satellites  and  their  operation  are  employed  in 
order  to  model  the  system  as  a  Markov  decision  process,  to  prevent  state  space 
explosion  of  the  models,  and  to  produce  tractable  solutions.  Chapter  5  discusses 
the  relaxation  of  these  assumptions  in  future  work;  however,  in  these  initial  models 
the  assumptions  assist  in  focusing  attention  on  the  solution  method  rather  than 
attempting  to  provide  a  perfectly  realistic  model.  The  following  two  assumptions 
are  required  to  model  satellite  constellations  as  presented  in  Sections  3.3  and  3.4. 

• Satellites are assumed to be independently subject to failure. This implies that
such events as solar proton events and geomagnetic storms do not affect the
failure of the satellites. Satellite failures are rarely caused by such events;
therefore this assumption can be considered to have minimal impact.

• The assumption of exponentially distributed satellite lifetimes greatly simplifies
the modeling of satellite constellations. Exponential lifetimes enable the
computation of conditional probabilities involving the interval reliability of the
satellites. Barlow and Proschan [6], page 18, suggest that it is valid to assume
exponentially distributed lifetimes for complex systems with many critical,
independently operating components as the number of components and time
in operation increase. Given the complexity and criticality of satellite subsystems,
it is assumed reasonable to employ this assumption.

In an effort to present the methodology and analysis in a clear and understandable
manner, the following additional assumptions were made. Though the following
are not rigid, they are employed in order to more easily demonstrate the analysis of
the model's output.

• For the models presented herein, it is assumed that when a spare satellite is
ordered, it will be available during the next time period. In reality, it can
sometimes take many months to build a satellite. Molnau, Olivieri, and Spalt
[31] state that for traditional space vehicle manufacturing it can typically take
18 months to build a satellite. For modern manufacturing processes, the time
between producing multiple satellites can be reduced to two months. This
assumption simplifies the modeling of satellite constellations. Because satellite
lifetimes are long relative to the time it takes to build a new satellite, and
because launch boosters are highly reliable, the likelihood that a new satellite
can be built before a recently replaced satellite itself needs replacement
is high. For example, many satellites currently have lifetimes of ten years or
more, but satellite production times are typically much shorter than this time
period. The main effect of allowing satellite production to exceed three months
is that orders for replacement satellites would need to be placed earlier.

•  The  models  also  assume  that  there  will  be  at  most  one  spare  satellite  for  each 
satellite  considered  an  active  part  of  the  constellation.  There  could  actually 
be  any  number  of  spare  satellites  from  none  to  multiple  spare  satellites  per 
active  satellite.  For  example,  a  Lockheed  Martin  press  release  [37]  states  that 
in  January  of  2001  there  were  14  GPS  satellites  in  storage.  Allowing  one  spare 
per  active  satellite  should  allow  ample  spare  satellites  to  be  kept  on  hand.  It 
should  also  provide  a  reasonable  modeling  bound  for  determining  the  state 
space  where  the  number  of  spare  satellites  is  a  factor  in  state  definitions. 

• Satellites are assumed to be either operational or non-operational. Most satellites
suffer from some degradation due to the stresses of launch and the harsh
space environment. Yet these degraded satellites are still capable of performing
some necessary functions required by the mission. For the purpose of this
study, these satellites, even though degraded, are considered operational until
the degradation reaches the point at which the mission under evaluation can no
longer be performed at a satisfactory level. When a satellite reaches this level
of degradation, for the purpose of this model and the mission being evaluated
it is considered non-operational. Secondary payloads or missions need to be
evaluated separately.

• While performing policy evaluation, the models rely on the assumption that
the costs involved are known with certainty. It is reasonable that the cost of
building a satellite, the cost of storing a satellite, and the cost of launching a
satellite are fairly well known. Conversely, the penalty cost which is assessed
when the constellation is not fully operational is not as obvious. There exists
a direct relationship between the penalty cost and the operational
level of the satellite constellation. As the penalty cost rises, so will the
operational level of the satellite constellation. This relationship exists because
as the penalty cost rises the penalty charges can be avoided by making more
frequent satellite replacements in order to prevent outages. Buffa and Miller [9],
pages 142-143, present the idea of service levels wherein penalty cost, $c_{penalty}$,
can be expressed as some function of the service level, $l_s$, where

$$c_{penalty} = f(l_s). \tag{3.28}$$

Here $f$ is a utility function as described in Section 3.1.2. It is beyond the scope
of this work to determine the utility function for the penalty cost. A nominal
value for the penalty cost is chosen in Chapter 4 and sensitivity analysis is
performed on that value.

•  For  the  purpose  of  this  research  the  models  assume  that  the  cost  of  each  satellite 
is constant throughout the time horizon being evaluated. The purpose of this
assumption  is  to  focus  attention  on  the  methodology  and  to  make  analysis  of 
the  model  outputs  more  straightforward.  Often  a  contract  will  be  made  to 
build  several  satellites  over  a  period  of  time.  These  contracts  allow  the  buyer 
to  obtain  lower  prices  for  each  individual  satellite.  For  example,  Jane’s  Space 
Directory  [5],  page  573,  reports  that  for  GPS  Navstar  Block  2R  satellites  there 
was  a  design  and  development  cost  of  119  million  dollars  and  that  the  first  20 
satellites  after  that  would  cost  a  total  of  575  million  dollars.  The  total  cost 
of  development  and  production  for  the  first  20  satellites  is  694  million  dollars. 
The  average  cost  per  satellite  is  34.7  million  dollars.  Such  long  term  contracts 
lock  in  the  cost  of  satellites  and  allow  the  average  price  to  be  used  as  a  constant 
cost  for  modeling. 

•  Each  satellite  is  assumed  to  be  launched  separately,  although  in  some  cases 
multiple  satellites  could  be  launched  from  the  same  booster.  This  is  especially 
evident  when  launching  multiple  satellites  into  the  same  orbital  plane.  For 
example,  Iridium  satellites  had  two  satellites  per  launch  on  Long  March  2 
rockets, five satellites per launch on Delta 2 rockets, and seven satellites per
launch  on  Proton  K/DM  rockets  [22]  pages  223,  121,  and  292.  Assuming  only 
one  satellite  per  launch  is  acceptable  because  this  is  the  most  frequent  case. 
It  also  provides  a  conservative  estimate  of  the  cost.  With  the  exception  of 
smaller,  lighter  weight  satellites  such  as  Iridium  and  Globalstar,  most  satellites 
are  launched  individually  [22],  pages  65-66,  107-110,  196-200,  222-223,  288-292, 
and  364-376. 

• The launch costs per satellite are also assumed to be constant. Again, the main
reason to hold launch cost constant throughout the time horizon under evaluation
is to focus attention on the methodology and make analysis of model output easier
to interpret. This assumption follows from the assumption that each satellite is
launched separately. If multiple satellites were allowed to be launched together,
then a different booster may be required that has different costs. For example, Long
March 2, Delta 2, and Proton rockets were among the launchers used to launch
Iridium satellites [22], pages 223, 109, and 292. Delta 2 launches boosted five
Iridium satellites per launch into orbit at a cost of 50 to 60 million dollars [22],
page 98. Proton launches put seven Iridium satellites per launch into orbit at
a cost of 90 to 100 million dollars [22], page 284. Depending on the number
of satellites launched at the same time and the booster required, many different
launch costs could be possible for the satellites. By considering each satellite
to be launched separately, the launch cost can be assumed constant and system
modeling is simplified. Again, this would be a conservative estimate of costs.

•  On-orbit  spare  satellites  are  not  considered  in  the  models.  Many  constellations 
do  not  make  use  of  on-orbit  spares  due  to  the  added  cost  and  the  extra  wear 
to  the  satellite  caused  by  additional  exposure  to  the  space  environment.  For 
this  reason  and  to  keep  the  state  space  of  the  problem  small,  on-orbit  spares 
are  not  considered. 

• The models assume that launch facilities and launchers are available for all
launches so that launches occur at the prescribed times. The Delta 2 rocket
has demonstrated a peak launch rate of 12 launches per year and it is estimated
that the Delta 2 has a maximum surge launch rate of 15 launches per year [22],
page 99. This type of limitation exists for all launchers; for example, the Proton
rocket is currently being produced at a maximum rate of 15 per year [22], page
285. The relatively long satellite lifetimes make it unlikely that large numbers
of launches will be needed each year, under normal circumstances, to maintain
a constellation.

• The models (as presented) do not assume any budgetary constraints. For the
purpose of finding the optimal policy for maintaining the satellite constellation
with minimum expected total cost this is a fair assumption. The optimal policy
provides the minimum expected total cost over the time horizon evaluated.
A policy derived under such an assumption is useful in creating a budget or
requesting funds for the maintenance of the constellation. Having derived this
optimal replacement policy places strong impetus on following the policy,
because following any other course of action will have an expected total cost
that is at least as high.

However, if the decision is made to limit the funding for maintaining a satellite
constellation during any time period, it is easy to implement budgetary
constraints. By implementing budgetary constraints, the new policy will result in
a minimum expected cost that is greater than or equal to that of the unconstrained
model. To implement the budgetary constraints, the actions that result in an
immediate monetary reward (cost) exceeding the allowable budgeted amount
can be ignored as if these actions did not exist. In this way, an optimal policy
can be found that minimizes the total expected cost of maintaining the satellite
constellation while still satisfying budgetary constraints.

• This model only takes into account the cost directly associated with the building,
storage, and launch of a satellite, as well as the opportunity cost of a satellite
failure. This model does not include the costs of maintaining a ground system
with which to operate the satellite.
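
The budgetary-constraint mechanism described in the assumptions above (dropping any action whose immediate cost exceeds the budgeted amount) can be illustrated with a single decision step. The sketch below is hypothetical: the function name `best_action`, the action names, and every cost figure are invented for illustration, and the "expected future cost" of each action is simply given rather than computed. In the full model the same filter would be applied to the action set inside each backward-induction step.

```python
from typing import Dict, Tuple


def best_action(costs: Dict[str, Tuple[float, float]],
                budget: float = float("inf")) -> Tuple[str, float]:
    """Pick the action with the smallest (immediate + expected future) cost
    among the actions whose immediate cost fits within the budget."""
    allowed = {a: c for a, c in costs.items() if c[0] <= budget}
    if not allowed:
        # Assumption of this sketch: at least one action must remain available.
        raise ValueError("no action fits within the budget")
    name = min(allowed, key=lambda a: sum(allowed[a]))
    return name, sum(allowed[name])


# (immediate cost, expected future cost) per action; all values are made up.
costs = {
    "do_nothing": (0.0, 120.0),
    "order_spare": (35.0, 60.0),
    "order_two_spares": (70.0, 45.0),
}
unconstrained = best_action(costs)
constrained = best_action(costs, budget=30.0)
print(unconstrained, constrained)
```

Because the constrained problem optimizes over a subset of the original actions, its minimum cost can never fall below the unconstrained minimum, which is the inequality stated in the assumption.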

3.3  Single-unit  Model 

This  section  presents  the  formulation  for  a  model  with  a  single  satellite.  In 
this  case,  the  number  of  satellites  in  the  constellation  is  M  =  1.  As  presented  in 
Section  3.1.1,  it  is  necessary  to  define  decision  epochs,  periods,  states,  action  sets, 
rewards,  transition  probabilities,  decision  rules,  and  policies. 

The single satellite model is a finite time horizon model such that $T = \{1, 2, \ldots, N\}$
where $N < \infty$. The states of the single satellite model represent the states of the
stochastic process $\{(X_t, S_t) : t \in T\}$, where $X_t$ represents whether the satellite
is operational and $S_t$ represents whether a spare satellite is available. For example,
$(X_5 = 1, S_5 = 0)$ means that at $t = 5$ the satellite is operational ($X_5 = 1$) and there
is not a spare satellite available ($S_5 = 0$). The state space for the model contains
four states and is denoted as $S = \{s_1, s_2, s_3, s_4\}$. The states are defined in Table 3.1.


Table 3.1 Single satellite model state definitions.

State    Definition
$s_1$    the satellite is working and no replacement satellite is available
$s_2$    the satellite is working and a replacement satellite is available
$s_3$    the satellite is not working and a replacement satellite is not available
$s_4$    the satellite is not working and a replacement satellite is available

The set of possible actions for each state depends explicitly on the state. When
the system is in state $s$, the actions from set $A_s$ are available for selection. The
actions for the four states of this model are defined in Table 3.2.

Table 3.2 Single satellite model action definitions.

Action Set   Action       Definition
$A_1$        $a_{1,1}$    Let the system run without intervention
             $a_{1,2}$    Order a replacement satellite
$A_2$        $a_{2,1}$    Let the system run without intervention
             $a_{2,2}$    Replace the satellite with the available spare
             $a_{2,3}$    Replace the satellite and order a new replacement
$A_3$        $a_{3,1}$    Let the system run without intervention
             $a_{3,2}$    Order a replacement satellite
$A_4$        $a_{4,1}$    Let the system run without intervention
             $a_{4,2}$    Replace the satellite with the available spare
             $a_{4,3}$    Replace the satellite and order a new replacement

Transition probabilities for the model are defined using the following notation:

Interval reliability: The interval reliability of a satellite, denoted by $R_{sat}$, is the
probability that the satellite will survive until the next decision epoch. A
derivation of interval reliability is given in Proposition 3.1.

Probability of a successful launch: The probability of a successful launch, denoted
as $P_{sl}$, is the probability that a satellite is launched and becomes operational.
This probability includes the event of a successful launch into orbit,
any transfer maneuvers into the final orbit, and successful completion of initial
check-out procedures until the satellite is declared operational.

The  transitions  for  this  model  are  shown  pictorially  in  the  state  transition 
diagram  of  Figure  3.2. 


Proposition 3.1 derives the interval reliability when satellite lifetimes are
distributed exponentially. The memoryless property of the exponential distribution is
exploited in solving the problem using Markov decision processes.

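Figure 3.2 State transition diagram.

Proposition 3.1 If $W$ is the exponentially distributed lifetime of a satellite with
rate parameter $\lambda$ (so that the mean lifetime is $1/\lambda$) and $\tau_1, \tau_2 \in [0, \infty)$, where
$\tau_2 = \tau_1 + \Delta t$ ($\Delta t$ is the length of one time period), then

$$P\{W > \tau_2 \mid W > \tau_1\} = e^{-\lambda \Delta t}, \quad \Delta t > 0.$$

Proof. The proof of Proposition 3.1 follows.

$$\begin{aligned}
P\{W > \tau_2 \mid W > \tau_1\} &= P\{W > \tau_1 + \Delta t \mid W > \tau_1\} \\
&= \frac{P\{W > \tau_1 + \Delta t,\ W > \tau_1\}}{P\{W > \tau_1\}} \\
&= \frac{P\{W > \tau_1 + \Delta t\}}{P\{W > \tau_1\}} \\
&= \frac{1 - (1 - e^{-\lambda(\tau_1 + \Delta t)})}{1 - (1 - e^{-\lambda \tau_1})} \\
&= \frac{e^{-\lambda(\tau_1 + \Delta t)}}{e^{-\lambda \tau_1}} \\
&= e^{-\lambda \Delta t}.
\end{aligned}$$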
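
As a numerical illustration of Proposition 3.1, the sketch below computes the interval reliability for an exponential lifetime and checks that the conditional survival probability does not depend on the satellite's current age. It treats the exponential parameter as a failure rate equal to the reciprocal of the mean lifetime. The function names and the example values (a ten-year mean lifetime and a quarterly time period) are assumptions chosen only for illustration.

```python
import math


def interval_reliability(mean_lifetime: float, dt: float) -> float:
    """R_sat = exp(-dt / mean_lifetime), i.e. exp(-rate * dt) with rate = 1 / mean."""
    return math.exp(-dt / mean_lifetime)


def conditional_survival(mean_lifetime: float, age: float, dt: float) -> float:
    """P{W > age + dt | W > age}, computed as a ratio of survival functions."""
    def survival(x: float) -> float:
        return math.exp(-x / mean_lifetime)
    return survival(age + dt) / survival(age)


# Illustrative values: mean lifetime of 10 years, time period of one quarter.
r_sat = interval_reliability(mean_lifetime=10.0, dt=0.25)
print(round(r_sat, 5))

# Memoryless property: the same value results for a new or a 7-year-old satellite.
for age in (0.0, 3.0, 7.0):
    assert abs(conditional_survival(10.0, age, 0.25) - r_sat) < 1e-12
```

The fact that the result is independent of age is what allows the state of the model to omit the satellite's time in service.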

Table  3.3  summarizes  the  transition  probabilities  for  the  single-unit  model. 


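Table 3.3 Single satellite model transition probabilities.

Current State   Probability                     Value
$s_1$           $p_t(s_1 \mid s_1, a_{1,1})$    $R_{sat}$
                $p_t(s_3 \mid s_1, a_{1,1})$    $1 - R_{sat}$
                $p_t(s_2 \mid s_1, a_{1,2})$    $R_{sat}$
                $p_t(s_4 \mid s_1, a_{1,2})$    $1 - R_{sat}$
$s_2$           $p_t(s_2 \mid s_2, a_{2,1})$    $R_{sat}$
                $p_t(s_4 \mid s_2, a_{2,1})$    $1 - R_{sat}$
                $p_t(s_1 \mid s_2, a_{2,2})$    $P_{sl}$
                $p_t(s_3 \mid s_2, a_{2,2})$    $1 - P_{sl}$
                $p_t(s_2 \mid s_2, a_{2,3})$    $P_{sl} + R_{sat} \times (1 - P_{sl})$
                $p_t(s_4 \mid s_2, a_{2,3})$    $(1 - R_{sat}) \times (1 - P_{sl})$
$s_3$           $p_t(s_3 \mid s_3, a_{3,1})$    $1$
                $p_t(s_4 \mid s_3, a_{3,2})$    $1$
$s_4$           $p_t(s_4 \mid s_4, a_{4,1})$    $1$
                $p_t(s_1 \mid s_4, a_{4,2})$    $P_{sl}$
                $p_t(s_3 \mid s_4, a_{4,2})$    $1 - P_{sl}$
                $p_t(s_2 \mid s_4, a_{4,3})$    $P_{sl}$
                $p_t(s_4 \mid s_4, a_{4,3})$    $1 - P_{sl}$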
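
The transition probabilities of Table 3.3 can be encoded directly as a nested dictionary, which also makes it easy to confirm that the probabilities for every state-action pair sum to one. The sketch below is only an illustration: the identifiers (`single_unit_transitions`, keys such as `"a23"` for action $a_{2,3}$) and the numerical values chosen for $R_{sat}$ and $P_{sl}$ are assumptions and are not taken from the thesis.

```python
from typing import Dict

TransitionTable = Dict[str, Dict[str, Dict[str, float]]]


def single_unit_transitions(r_sat: float, p_sl: float) -> TransitionTable:
    """Table 3.3 as table[state][action] -> {next_state: probability}."""
    return {
        "s1": {"a11": {"s1": r_sat, "s3": 1 - r_sat},
               "a12": {"s2": r_sat, "s4": 1 - r_sat}},
        "s2": {"a21": {"s2": r_sat, "s4": 1 - r_sat},
               "a22": {"s1": p_sl, "s3": 1 - p_sl},
               "a23": {"s2": p_sl + r_sat * (1 - p_sl),
                       "s4": (1 - r_sat) * (1 - p_sl)}},
        "s3": {"a31": {"s3": 1.0},
               "a32": {"s4": 1.0}},
        "s4": {"a41": {"s4": 1.0},
               "a42": {"s1": p_sl, "s3": 1 - p_sl},
               "a43": {"s2": p_sl, "s4": 1 - p_sl}},
    }


# Notional parameter values, chosen only to exercise the table.
table = single_unit_transitions(r_sat=0.975, p_sl=0.95)

# Every (state, action) row must be a probability distribution over next states.
row_sums = {(s, a): sum(dist.values())
            for s, actions in table.items() for a, dist in actions.items()}
assert all(abs(total - 1.0) < 1e-12 for total in row_sums.values())
print(len(row_sums), "state-action pairs checked")
```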

In this model all of the rewards are actually costs. The costs used here are
defined as follows:

Satellite Costs: Satellite costs, denoted as $c_{sat}$, are all of the costs involved in
purchasing a new satellite.

Holding Costs: Holding costs, denoted as $c_{hold}$, are the costs associated with
storing a satellite prior to its launch.

Launch Costs: Launch costs, denoted as $c_{launch}$, include all of the costs associated
with the launching of a satellite. These costs include the launch booster, the
shipment of the launch booster, launch range support, etc.

Penalty Costs: Penalty costs, denoted as $c_{penalty}$, are the costs associated with not
keeping each satellite in the constellation at an operational level.

The rewards for the single satellite model are shown in Table 3.4.

Table 3.4 Single satellite model rewards.

Current State   Reward                 Definition
$s_1$           $r_t(s_1, a_{1,1})$    $0$
                $r_t(s_1, a_{1,2})$    $c_{sat}$
$s_2$           $r_t(s_2, a_{2,1})$    $c_{hold}$
                $r_t(s_2, a_{2,2})$    $c_{launch}$
                $r_t(s_2, a_{2,3})$    $c_{launch}$
$s_3$           $r_t(s_3, a_{3,1})$    $c_{penalty}$
                $r_t(s_3, a_{3,2})$    $c_{sat} + c_{penalty}$
$s_4$           $r_t(s_4, a_{4,1})$    $c_{penalty} + c_{hold}$
                $r_t(s_4, a_{4,2})$    $c_{launch} + c_{penalty}$
                $r_t(s_4, a_{4,3})$    $c_{launch} + c_{sat} + c_{penalty}$

The minimum expected total cost $u^*_t(\cdot)$ is defined for state $s_t$ by

$$u^*_t(s_t) = \max_{a \in A_{s_t}} \Big\{ \sum_{j \in S} p_t(j \mid s_t, a)\, u^*_{t+1}(s_t, a, j) + r_t(s_t, a) \Big\}, \tag{3.30}$$

where the costs are defined with negative values and the boundary value

$$u_N(s_N) = r_N(s_N). \tag{3.31}$$


3.4  Multi-unit  Model 

This section presents the formulation of the multi-unit model. A system of
multiple satellites is by definition a constellation. Here $M$, the number of satellites in
the system, is two or more ($M \ge 2$). As presented in Section 3.1.1, it is necessary to
define decision epochs, periods, states, action sets, rewards, transition probabilities,
decision rules, and policies. This model has a finite time horizon with the decision
epochs occurring at the beginning of each time period, so that $T = \{1, 2, \ldots, N\}$
where $N < \infty$.

The states of the model can be defined by making use of the following observations.
Each state is made up of two pieces of information: the satellites which are
operational and the number of spare satellites currently available. Thus the system
is a stochastic process $\{(X_t, S_t) : t \in T\}$ where $X_t$ represents which of the satellites
are operational at decision epoch $t$ and $S_t$ represents the number of spare satellites
available at decision epoch $t$. For example, the case $(X_2 = 5, S_2 = 3)$ refers to
decision epoch $t = 2$: $X_2 = 5$ might mean, for example, that satellites 1, 2, and 4 are
operational, and $S_2 = 3$ means three spare satellites are available.

For any multi-unit system there is a set of cases representing each possible
combination of satellites working and not working, the $X_t$'s. This set includes cases
ranging from all of the satellites working to none of the satellites working. For
this model it is important to distinguish between individual satellites so that the
case with satellites 1 and 2 working is distinct from the case with satellites 1 and 3
working. Each of these cases represents distinct states of the system. Each possible
value of $X_t$ is paired with each possible value of $S_t$ (the number of spare satellites
available) in order to form the states of the model. If the maximum number of spare
satellites allowed in the system is the same as the number of satellites in the system,
$M$, then there are $M + 1$ states corresponding to each value of $X_t$. There are $M + 1$
states because there is a state representing each possible value of $S_t$: $0, 1, \ldots, M$.

The total number of states in the model is clearly dependent on $M$. With
$M \ge 2$ the total number of states in a system with $M$ satellites can be found using
the equation

$$\Big\{ \sum_{j=0}^{M} {}_{M}C_{j} \Big\} \times (M + 1) \tag{3.32}$$

where ${}_{M}C_{j}$ represents the number of combinations of $M$ objects taken $j$ at a time.
Combinatorial growth of the state space is clearly present.
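
The state count of Equation (3.32) can be checked by brute-force enumeration. In the sketch below each value of $X_t$ is represented as a tuple of 0/1 flags (one per satellite), which is an encoding assumed here for illustration; the function names are likewise hypothetical. Since the binomial coefficients sum to $2^M$, the count also equals $2^M (M + 1)$.

```python
from itertools import product
from math import comb
from typing import List, Tuple

State = Tuple[Tuple[int, ...], int]  # (operational flags per satellite, spares)


def enumerate_states(m: int) -> List[State]:
    """Pair every working/failed pattern of m satellites with 0..m spares."""
    patterns = product((0, 1), repeat=m)
    return [(x, s) for x in patterns for s in range(m + 1)]


def state_count(m: int) -> int:
    """Equation (3.32): (sum over j of C(m, j)) * (m + 1)."""
    return sum(comb(m, j) for j in range(m + 1)) * (m + 1)


for m in range(2, 7):
    n_states = len(enumerate_states(m))
    assert n_states == state_count(m) == 2 ** m * (m + 1)
    print(m, n_states)
```

The printed counts grow from 12 states for two satellites to 448 for six, which illustrates the combinatorial growth noted above.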

Actions for the model are dependent on the state of the system and are defined
by applying the following rules. A set of actions is determined for each system
state by comparing the current state of the system to the system state that has $M$
operational satellites and a full complement of $M$ spare satellites. Actions range from
maintaining the current system status to moving the system towards the most robust
system state (i.e., the state with $M$ operational satellites and $M$ spare satellites).

The simplest and most basic action is to allow the system to continue running
without intervention. The next type of action to consider is the purchase of spare
satellites until there are $M$ spares available. For example, if there are currently $k$
spare satellites on hand, there are $M - k$ actions corresponding to purchase of spare
satellites in the quantities $1, 2, \ldots, M - k$. The final class of actions to consider is
what to do with spare satellites currently on hand. These satellites can be used to
replace any of the satellites in the model. There are actions representing every
possible combination of replacement using the replacement satellites that are
currently available. Whether specific satellites are operational or failed is not taken
into account when defining the actions. Each possible action that follows the set of
rules is enumerated. The algorithm will select which actions are necessary to
minimize the expected total cost.

The  number  of  actions  for  a  state  with  k  replacement  satellites  available  at  the 
beginning  of  the  time  period  is  determined  using  the  following  equations: 

$$M - k + 1, \quad M \ge 2,\ k = 0, \tag{3.33}$$

and

$$\Big\{ \sum_{j=1}^{k} \big({}_{M}C_{j}\big) \times (M - k + 1 + j) \Big\} + M - k + 1, \quad M \ge 2,\ k \ge 1. \tag{3.34}$$
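
The action counts can be checked by enumerating actions explicitly. The sketch below reflects one reading of the rules above, stated here as an assumption: an action consists of choosing which $j \le k$ satellites to replace with spares on hand, followed by purchasing between zero and $M - (k - j)$ new spares so that at most $M$ spares are held. The function names and the tuple encoding of an action are hypothetical.

```python
from itertools import combinations
from math import comb
from typing import List, Tuple

Action = Tuple[Tuple[int, ...], int]  # (satellites replaced, spares purchased)


def enumerate_actions(m: int, k: int) -> List[Action]:
    """List every (replacement set, purchase quantity) pair for k spares on hand."""
    actions = []
    for j in range(k + 1):                       # number of spares launched
        for replaced in combinations(range(1, m + 1), j):
            spares_left = k - j
            for purchased in range(m - spares_left + 1):
                actions.append((replaced, purchased))
    return actions


def action_count(m: int, k: int) -> int:
    """Closed-form count: no-replacement actions plus the sum over j = 1..k."""
    if k == 0:
        return m - k + 1
    return sum(comb(m, j) * (m - k + 1 + j) for j in range(1, k + 1)) + m - k + 1


for m in range(1, 5):
    for k in range(m + 1):
        assert len(enumerate_actions(m, k)) == action_count(m, k)
print(action_count(1, 0), action_count(1, 1), action_count(2, 2))
```

For $M = 1$ the counts are 2 actions with no spare and 3 actions with one spare, which agrees with the sizes of the action sets in Table 3.2.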

The transition probabilities for the multi-unit model use the same definitions
presented in Section 3.3 with an extended subscript denoting which satellite is being
referred to (e.g. $R_{sat_i}$ and $P_{sl_i}$). Extending the subscript for the interval reliability
allows the different satellites to have different lifetimes. Being able to allow different
satellite lifetimes is useful because upgrades are often made to replacement satellites
during their construction. Extending the subscript for the probability of a successful
launch is also useful because launching satellites into different orbits causes different
stresses during the launch. These variations can lead to different launch reliabilities.

The reliability parameters for the multi-unit problem are computed in the same
manner as those of the single-unit model. However, the transition probabilities for
the multi-unit problem are much more complex than those of the single-unit problem
due to the possibility of multiple events leading to the transition from some state $s_c$
to some state $s_d$. To compute the transition probabilities, each event that can lead to
a transition from state $s_c$ to state $s_d$ when action $a_{c,n}$ is chosen must be determined.
The transition probability, $p_t(s_d \mid s_c, a_{c,n})$, is found by summing the probabilities of
these events, which is valid because the events are mutually exclusive. For example,
assume there is a constellation consisting of two satellites, A and B, both of which are
operational. If the decision is made to replace satellite A, there are two possible events
that can take place and leave the system with both satellites operational at the next
decision epoch. First, if there is a successful launch that replaces satellite A and
satellite B continues to function, then both satellites will be operational. Second, if
the launch to replace satellite A is unsuccessful, but the original satellite A continues
to function, as does satellite B, then both satellites will still be considered operational.
Thus, the probability of each event occurring must be determined. Because the events
are mutually exclusive, their probabilities are summed to determine the transition
probability from the state with both satellites operational back to the state with
both satellites operational when the decision is to replace satellite A.
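
The two-satellite example can be worked numerically. The sketch below computes the probability that both satellites are operational at the next decision epoch in two ways: from the sum of the two events described above, and by enumerating all outcomes of the launch and of the two satellites' survival. It assumes, consistent with the single-unit transition probabilities, that a successfully launched replacement is operational at the next epoch. The function names and parameter values are illustrative assumptions.

```python
from itertools import product


def prob_both_operational(p_sl_a: float, r_a: float, r_b: float) -> float:
    """Sum of the two events: (launch succeeds, B survives) and
    (launch fails, original A survives, B survives)."""
    return p_sl_a * r_b + (1 - p_sl_a) * r_a * r_b


def prob_by_enumeration(p_sl_a: float, r_a: float, r_b: float) -> float:
    """Same probability by summing over all launch/survival outcomes."""
    total = 0.0
    for launch_ok, a_ok, b_ok in product((True, False), repeat=3):
        p = ((p_sl_a if launch_ok else 1 - p_sl_a)
             * (r_a if a_ok else 1 - r_a)
             * (r_b if b_ok else 1 - r_b))
        # Slot A is operational if the replacement arrived or the original survived.
        if (launch_ok or a_ok) and b_ok:
            total += p
    return total


# Notional values for launch success and the two interval reliabilities.
closed_form = prob_both_operational(p_sl_a=0.95, r_a=0.97, r_b=0.98)
enumerated = prob_by_enumeration(p_sl_a=0.95, r_a=0.97, r_b=0.98)
assert abs(closed_form - enumerated) < 1e-12
print(round(closed_form, 6))
```

The enumeration makes explicit why summation is appropriate: every outcome is counted in exactly one of the two events, so the events cannot occur together.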

Determining the rewards for the multi-unit model is straightforward. The
rewards for the multi-unit system are actually costs and use the same cost definitions
as used by the single-unit model. For decisions resulting in the purchase of a
satellite the cost of a satellite, $c_{sat}$, is assessed. This cost is assessed for each satellite
purchased at that decision epoch or as a result of that action. Holding costs, $c_{hold}$,
are assessed whenever there is a satellite available for replacement at the beginning
of the time period (i.e. at the decision epoch). This charge is assessed for each
satellite in holding status. Launch costs, $c_{launch}$, are assessed for each satellite that
is launched during the time period. Penalty costs, $c_{penalty}$, are assessed for each
satellite that is not operational at the beginning of the period.
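
The cost rules above amount to a per-period sum of four terms. A minimal sketch follows, assuming the caller supplies the counts of satellites purchased, held, launched, and non-operational for the state and action in question; the class name, function names, and all cost values are hypothetical and serve only to show the arithmetic and the sign convention (costs entered as negative rewards).

```python
from typing import NamedTuple


class Costs(NamedTuple):
    """Unit costs in millions of dollars (notional values only)."""
    c_sat: float
    c_hold: float
    c_launch: float
    c_penalty: float


def period_cost(costs: Costs, purchased: int, held: int,
                launched: int, failed: int) -> float:
    """Total cost assessed in one period for the given counts."""
    return (costs.c_sat * purchased + costs.c_hold * held
            + costs.c_launch * launched + costs.c_penalty * failed)


def reward(costs: Costs, purchased: int, held: int,
           launched: int, failed: int) -> float:
    """Reward used in the optimality equations: the negated period cost."""
    return -period_cost(costs, purchased, held, launched, failed)


notional = Costs(c_sat=34.7, c_hold=1.0, c_launch=55.0, c_penalty=100.0)
# One satellite purchased, two spares held, one launched, one satellite failed.
example = reward(notional, purchased=1, held=2, launched=1, failed=1)
print(round(example, 1))
```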

The minimum expected total cost $u^*_t(\cdot)$ is defined for state $s_t$ by

$$u^*_t(s_t) = \max_{a \in A_{s_t}} \Big\{ \sum_{j \in S} p_t(j \mid s_t, a)\, u^*_{t+1}(s_t, a, j) + r_t(s_t, a) \Big\}, \tag{3.35}$$

where the costs are defined with negative values and the boundary value

$$u_N(s_N) = r_N(s_N). \tag{3.36}$$

This chapter presented a review of Markov decision processes. The review
covered the components required to formulate a Markov decision process, optimality
criteria, and solution methods for determining optimal policies. Following the review,
the chapter discussed the assumptions made about satellite constellations in order to
solve the problem using Markov decision processes. After addressing these concerns,
models for a single-unit problem and a multi-unit problem were given. In Chapter
4, numerical examples using notional data are given for both the single-unit and
multi-unit problems. Sensitivity analysis of the notional values for satellite lifetime
and costs follows the examples.
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4.  Numerical  Results  and  Analysis 

This  chapter  presents  numerical  results  and  analysis  of  the  models  using  no¬ 
tional  data.  The  numerical  results  are  based  on  the  single-unit  model  presented  in 
Section  3.3  and  the  multi-unit  model  presented  in  Section  3.4.  Moreover,  a  sensitiv¬ 
ity  analysis  of  model  parameters,  such  as  the  mean  satellite  lifetimes  and  the  penalty 
costs,  is  provided. 

4.1 Single-unit Example

A  single-unit  example  is  first  provided  to  demonstrate  the  means  by  which 
to  find  an  optimal  replacement  policy  using  Markov  decision  processes.  The  same 
techniques  used  to  find  a  solution  for  the  single-unit  problem  are  then  applied  to  a 
more  complicated  multi-unit  problem. 

The  single-unit  example  uses  three  months  (or  one  quarter)  of  the  fiscal  year 
as  the  time  periods.  The  decision  epochs  are  defined  to  be  T  =  1,2, . . . ,  N  where 
N  =  40  quarters  which  represents  a  10-year  time  horizon.  The  states  for  this  model 
are  that  the  single  satellite  is  operating  or  failed  and  whether  a  spare  replacement 
satellite  is  available.  These  are  identical  to  the  states  given  in  Table  3.1  and  are 
reproduced  in  Table  4.1  for  convenience. 


Table  4.1  Single  satellite  model  state  definitions. 


State   Definition
-----   ------------------------------------------------------------------------
s_1     the satellite is working and no replacement satellite is available
s_2     the satellite is working and a replacement satellite is available
s_3     the satellite is not working and a replacement satellite is not available
s_4     the satellite is not working and a replacement satellite is available

The actions for the single-unit model relate to replacing the satellite,
ordering a new replacement, or allowing the system to run without intervention.
The actions for the model were given in Table 3.2 and are reproduced in Table
4.2 for convenience.

Table 4.2 Single satellite model action definitions.

Action Set   Action     Definition
----------   --------   --------------------------------------------------
A_1          a_{1,1}    Let the system run without intervention
             a_{1,2}    Order a replacement satellite
A_2          a_{2,1}    Let the system run without intervention
             a_{2,2}    Replace the satellite with the available spare
             a_{2,3}    Replace the satellite and order a new replacement
A_3          a_{3,1}    Let the system run without intervention
             a_{3,2}    Order a replacement satellite
A_4          a_{4,1}    Let the system run without intervention
             a_{4,2}    Replace the satellite with the available spare
             a_{4,3}    Replace the satellite and order a new replacement

A few reliability parameters must first be determined in order to specify the
transition probabilities for the model. The values chosen in this thesis are of
a notional nature, but are selected to be representative of real world systems
where possible. Because satellite lifetimes have been assumed to be
exponentially distributed, only one parameter, the mean lifetime of the
satellite, must be specified. Jane’s Space Directory [5], page 573, reports
that GPS Navstar Block 2R satellites have a design life of 10 years. The GPS
Block 2R design life is the notional value assumed for mean satellite lifetime
in this example. The mean satellite lifetime is represented by \lambda^{-1},
where \lambda^{-1} = 10 years or 40 quarters. After \lambda has been specified,
the interval reliability can be determined. As shown in Proposition 3.1, the
interval reliability is given by e^{-\lambda \Delta t}, where \Delta t is the
amount of time included in the interval. In this case, \Delta t amounts to
three months or one time period; therefore \Delta t = 1.
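
The interval reliability calculation described above can be checked with a few
lines of Python. This is only an illustrative sketch; the variable names are
assumptions introduced here and do not come from the thesis.

```python
import math

# Notional mean lifetime from the text: 10 years = 40 quarters.
mean_lifetime_quarters = 40.0
failure_rate = 1.0 / mean_lifetime_quarters   # lambda, per quarter
delta_t = 1.0                                  # one time period (one quarter)

# Interval reliability for an exponentially distributed lifetime:
# R_sat = exp(-lambda * delta_t)
r_sat = math.exp(-failure_rate * delta_t)
print(f"R_sat = {r_sat:.4f}")
```

Rounded to four decimal places, the result agrees with the interval
reliability reported in Table 4.3.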

The final probability that needs to be specified is the probability of a
successful launch. Recall from Section 3.3 that the probability of a successful
launch is defined as the probability of a satellite being launched and becoming
operational. Jane’s Space Directory [5], page 263, states that the Delta 2
rocket, the primary launch vehicle of the GPS satellite, has a vehicle success
rate of 99 percent. A notional value of 95 percent is used for the probability
of a successful launch. This value is determined by considering the joint event
that the launch is successful and that the satellite does not fail during the
checkout phase. These reliability parameters are summarized in Table 4.3.


Table  4.3  Single  satellite  model  reliability  parameters. 


Reliability                            Notation        Numerical Value
------------------------------------   -------------   ---------------
Mean lifetime of the satellite         \lambda^{-1}    40.0000
Interval reliability of the satellite  R_{sat}         0.9753
Probability of a successful launch     P_{sl}          0.9500

Once  the  reliability  parameters  have  been  specified  the  transition  probabilities 
can  be  determined.  Table  4.4  lists  the  definitions  of  the  transition  probabilities  and 
their  values. 


Table  4.4  Single  satellite  model  transition  probabilities. 


Current State   Probability                 Definition                          Value
-------------   -------------------------   ---------------------------------   ------
s_1             p_t(s_1 | s_1, a_{1,1})     R_{sat}                             0.9753
                p_t(s_3 | s_1, a_{1,1})     1 - R_{sat}                         0.0247
                p_t(s_2 | s_1, a_{1,2})     R_{sat}                             0.9753
                p_t(s_4 | s_1, a_{1,2})     1 - R_{sat}                         0.0247
s_2             p_t(s_2 | s_2, a_{2,1})     R_{sat}                             0.9753
                p_t(s_4 | s_2, a_{2,1})     1 - R_{sat}                         0.0247
                p_t(s_1 | s_2, a_{2,2})     P_{sl}                              0.9500
                p_t(s_3 | s_2, a_{2,2})     1 - P_{sl}                          0.0500
                p_t(s_2 | s_2, a_{2,3})     P_{sl} + R_{sat} x (1 - P_{sl})     0.9988
                p_t(s_4 | s_2, a_{2,3})     (1 - R_{sat}) x (1 - P_{sl})        0.0012
s_3             p_t(s_3 | s_3, a_{3,1})     1                                   1.0000
                p_t(s_4 | s_3, a_{3,2})     1                                   1.0000
s_4             p_t(s_4 | s_4, a_{4,1})     1                                   1.0000
                p_t(s_1 | s_4, a_{4,2})     P_{sl}                              0.9500
                p_t(s_3 | s_4, a_{4,2})     1 - P_{sl}                          0.0500
                p_t(s_2 | s_4, a_{4,3})     P_{sl}                              0.9500
                p_t(s_4 | s_4, a_{4,3})     1 - P_{sl}                          0.0500
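
One way to guard against transcription errors in Table 4.4 is to encode the
transition probabilities and confirm that, for every state and action pair,
they sum to one. The sketch below does this in Python; the dictionary layout
and the names `transitions` and `row_sums` are assumptions made for
illustration and are not taken from the thesis.

```python
import math

R_SAT = math.exp(-1.0 / 40.0)  # interval reliability of the satellite
P_SL = 0.95                    # probability of a successful launch

# transitions[(state, action)] = {next_state: probability}, following Table 4.4.
transitions = {
    (1, 1): {1: R_SAT, 3: 1 - R_SAT},
    (1, 2): {2: R_SAT, 4: 1 - R_SAT},
    (2, 1): {2: R_SAT, 4: 1 - R_SAT},
    (2, 2): {1: P_SL, 3: 1 - P_SL},
    (2, 3): {2: P_SL + R_SAT * (1 - P_SL), 4: (1 - R_SAT) * (1 - P_SL)},
    (3, 1): {3: 1.0},
    (3, 2): {4: 1.0},
    (4, 1): {4: 1.0},
    (4, 2): {1: P_SL, 3: 1 - P_SL},
    (4, 3): {2: P_SL, 4: 1 - P_SL},
}

def row_sums(table):
    """Return the total outgoing probability for each (state, action) pair."""
    return {key: sum(dist.values()) for key, dist in table.items()}

for key, total in row_sums(transitions).items():
    # Each row of a transition matrix must be a probability distribution.
    assert abs(total - 1.0) < 1e-12, key
    print(key, {j: round(p, 4) for j, p in transitions[key].items()})
```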

Before the total reward for each action can be determined, some basic reward
values must first be specified. The cost of satellites varies greatly depending
on the mission. Table 4.5 is produced from a table presented by Apgar, Bearden,
and Wong [2], who list the average unit cost in fiscal year 2000 dollars for
various types of satellites. The cost of a GPS (Block 2) satellite is given to
be 57 million fiscal year 2000 dollars in Table 4.5. The notional value of 50
million dollars is used as a representative value for the cost of a satellite.


Table 4.5 Satellite cost in FY00 $M (from Apgar, Bearden, and Wong).

Mission           Satellite       Average Unit Cost (FY00 $M)
---------------   -------------   ---------------------------
Communications    Intelsat VIII   133
                  TDRSS           126
                  DSCS IIIB       114
Navigation        GPS (Block 2)   57
Missile Warning   DSP             314
Weather           GOES            84
                  DMSP            88

According to Jane’s Space Directory [5], page 573, the General Accounting
Office (GAO) estimated in 1990 that it would cost $200,000 annually for each
GPS satellite kept in storage. This value is used as the basis for the nominal
holding cost, which is assumed to be $50,000 per quarter. Sensitivity analysis
of this value is performed to determine its effect on the minimum expected
total cost.

As stated in Section 3.2, a Delta 2 rocket costs from 50 to 60 million dollars.
The nominal value for the cost of launch used in the example is 55 million
dollars. The nominal value for the penalty cost is chosen to be 50 million
dollars. Because this is a notional value, sensitivity analysis is performed in
the following section. The basic reward (cost) values are summarized in Table
4.6.

Table  4.6  Single  satellite  model  costs. 


Cost          Value
-----------   -----------
c_{sat}       $50,000,000
c_{hold}      $50,000
c_{launch}    $55,000,000
c_{penalty}   $50,000,000

Once the basic reward values have been specified, the reward values for each
action can be determined. The total reward for each model action is given in
Table 4.7.

Table 4.7 Single satellite model reward values.


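Current State   Reward                Definition                           Value
-------------   -------------------   ----------------------------------   ------------
s_1             r_t(s_1, a_{1,1})     0                                    $0
                r_t(s_1, a_{1,2})     c_{sat}                              $50,000,000
s_2             r_t(s_2, a_{2,1})     c_{hold}                             $50,000
                r_t(s_2, a_{2,2})     c_{launch}                           $55,000,000
                r_t(s_2, a_{2,3})     c_{launch} + c_{sat}                 $105,000,000
s_3             r_t(s_3, a_{3,1})     c_{penalty}                          $50,000,000
                r_t(s_3, a_{3,2})     c_{sat} + c_{penalty}                $100,000,000
s_4             r_t(s_4, a_{4,1})     c_{penalty} + c_{hold}               $50,050,000
                r_t(s_4, a_{4,2})     c_{penalty} + c_{launch}             $105,000,000
                r_t(s_4, a_{4,3})     c_{penalty} + c_{launch} + c_{sat}   $155,000,000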
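
The composite rewards in Table 4.7 follow directly from the four basic costs in
Table 4.6. A minimal Python sketch of that arithmetic is given below; the
constant names and the `costs` dictionary are illustrative assumptions, with
values stored as positive dollar amounts.

```python
# Basic cost values from Table 4.6, in dollars.
C_SAT = 50_000_000
C_HOLD = 50_000
C_LAUNCH = 55_000_000
C_PENALTY = 50_000_000

# costs[(state, action)] = one-period cost, following the definitions in Table 4.7.
costs = {
    (1, 1): 0,
    (1, 2): C_SAT,
    (2, 1): C_HOLD,
    (2, 2): C_LAUNCH,
    (2, 3): C_LAUNCH + C_SAT,
    (3, 1): C_PENALTY,
    (3, 2): C_SAT + C_PENALTY,
    (4, 1): C_PENALTY + C_HOLD,
    (4, 2): C_PENALTY + C_LAUNCH,
    (4, 3): C_PENALTY + C_LAUNCH + C_SAT,
}

for (state, action), value in costs.items():
    print(f"state {state}, action {action}: ${value:,}")
```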

After specifying all of the reliability, cost, and reward parameters, the
problem can be solved using the Backward Induction Algorithm presented in
Section 3.1.3. The algorithm was programmed using the mathematical computing
package MATLAB®. Thirty runs of the model were made to assess the run time
characteristics. Relevant run time statistics are listed in Table 4.8. The runs
were conducted on a Dell® Inspiron 8100 with a 1 gigahertz Intel® Pentium® III
processor, 256 megabytes of RAM, and using the Microsoft® Windows 2000
Professional operating system.
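
For readers who want to experiment with the single-unit model, the following is
a minimal Python sketch of backward induction over the data in Tables 4.4, 4.6,
and 4.7. It is not the MATLAB program used in the thesis. Costs are expressed
as positive values in millions of dollars and minimized, which is equivalent to
maximizing negative rewards. The terminal cost is assumed to be zero because
the terminal rewards are not listed in this section, so the output is not
guaranteed to reproduce Tables 4.9 and 4.10. The names `MODEL` and
`backward_induction` are hypothetical.

```python
import math

N = 40                          # decision epochs (quarters)
R = math.exp(-1.0 / 40.0)       # interval reliability of the satellite
P = 0.95                        # probability of a successful launch
C_SAT, C_HOLD, C_LAUNCH, C_PEN = 50.0, 0.05, 55.0, 50.0   # costs in $M

# MODEL[state][action] = (one-period cost, {next_state: probability})
MODEL = {
    1: {1: (0.0, {1: R, 3: 1 - R}),
        2: (C_SAT, {2: R, 4: 1 - R})},
    2: {1: (C_HOLD, {2: R, 4: 1 - R}),
        2: (C_LAUNCH, {1: P, 3: 1 - P}),
        3: (C_LAUNCH + C_SAT, {2: P + R * (1 - P), 4: (1 - R) * (1 - P)})},
    3: {1: (C_PEN, {3: 1.0}),
        2: (C_SAT + C_PEN, {4: 1.0})},
    4: {1: (C_PEN + C_HOLD, {4: 1.0}),
        2: (C_PEN + C_LAUNCH, {1: P, 3: 1 - P}),
        3: (C_PEN + C_LAUNCH + C_SAT, {2: P, 4: 1 - P})},
}

def backward_induction(model, n_epochs, terminal=None):
    """Return (cost-to-go at epoch 1, policy[t][state]) for t = 1..n_epochs-1."""
    states = sorted(model)
    # Assumption: terminal cost is zero unless supplied by the caller.
    u = {s: (terminal or {}).get(s, 0.0) for s in states}
    policy = {}
    for t in range(n_epochs - 1, 0, -1):
        new_u, decision = {}, {}
        for s in states:
            best_action, best_cost = None, math.inf
            for a, (cost, trans) in sorted(model[s].items()):
                # One-period cost plus expected cost-to-go from the next epoch.
                q = cost + sum(p * u[j] for j, p in trans.items())
                if q < best_cost:
                    best_action, best_cost = a, q
            new_u[s], decision[s] = best_cost, best_action
        u, policy[t] = new_u, decision
    return u, policy

values, policy = backward_induction(MODEL, N)
for s in sorted(values):
    print(f"initial state s_{s}: expected total cost {values[s]:.3f} $M, "
          f"first action {policy[1][s]}")
```

Ties are broken in favor of the lowest-numbered action, which is a design
choice of this sketch. A different terminal cost can be supplied through the
`terminal` argument to study how the boundary value affects the last few
epochs of the policy.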


Table  4.8  Single-unit  run  time  characteristics. 


Execution Statistic   Time (seconds)
-------------------   --------------
Mean                  0.1142
Mode                  0.1100
Minimum               0.1100
Maximum               0.1210
Standard Deviation    0.0050

The policy derived from this algorithm is shown in Table 4.9. The table lists
which action should be selected, depending on the state of the system during
time epochs 1, ..., 40, in order to minimize the expected total cost. For
example, if the system is in state s_4 during any of the first 19 time epochs,
action a_{4,3} (replace the satellite and order a new replacement) should be
chosen. If the system is in state s_4 during time epochs 20 through 38, then
action a_{4,2} (replace the satellite with the available spare) should be
chosen. If in state s_4 during time epoch 39, action a_{4,1} (let the system
run without intervention) should be chosen. No decisions are made in time epoch
40 as this is the last epoch of the time horizon. After a decision is made it
is implemented at the beginning of the next time period. There is no time
following decision epoch 40 for this example, so a decision made at time 40
would not be implemented.


Table  4.9  Single  satellite  policies. 


Epoch   s_1   s_2   s_3   s_4      Epoch   s_1   s_2   s_3   s_4
-----   ---   ---   ---   ---      -----   ---   ---   ---   ---
1       2     1     2     3        21      1     1     2     2
2       2     1     2     3        22      1     1     2     2
3       2     1     2     3        23      1     1     2     2
4       2     1     2     3        24      1     1     2     2
5       2     1     2     3        25      1     1     2     2
6       2     1     2     3        26      1     1     2     2
7       2     1     2     3        27      1     1     2     2
8       2     1     2     3        28      1     1     2     2
9       2     1     2     3        29      1     1     2     2
10      2     1     2     3        30      1     1     2     2
11      2     1     2     3        31      1     1     2     2
12      2     1     2     3        32      1     1     2     2
13      2     1     2     3        33      1     1     2     2
14      2     1     2     3        34      1     1     2     2
15      2     1     2     3        35      1     1     2     2
16      2     1     2     3        36      1     1     2     2
17      2     1     2     3        37      1     1     2     2
18      2     1     2     3        38      1     1     2     2
19      1     1     2     3        39      1     1     2     1
20      1     1     2     2        40      0     0     0     0

Table 4.10 shows the minimum expected cost for following the policy of Table
4.9. Note that the minimum expected cost is dependent upon the initial state of
the system.


Table  4.10  Single  satellite  minimum  expected  cost. 


Initial State   Value ($M)
-------------   ----------
s_1             243.358
s_2             188.408
s_3             553.442
s_4             403.810

4.2 Analysis of the Single-unit Example

For the notional example presented, the rewards (costs) are represented as
notional values. When evaluating an actual constellation, the rewards (costs)
are estimates of the actual values. Because the values of the rewards are
approximate, it is desirable to investigate the impact of these values on the
minimum expected total cost. This sensitivity analysis is performed by varying
a parameter, such as the mean satellite lifetime or the cost of a launch, and
plotting the minimum expected total cost for each of the parameter values
evaluated. The plots for a single parameter show a curve for each possible
initial system state. As in Table 4.10, the minimum expected total cost is
dependent on the initial state.
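
The sensitivity procedure described above amounts to re-solving the model once
per parameter value and recording the resulting cost for each initial state. A
compact Python sketch for the mean satellite lifetime is shown below. It
assumes a zero terminal cost and uses positive costs in millions of dollars,
so its numbers are not guaranteed to match the figures in this chapter; the
function name `min_expected_cost` and the sampled lifetimes are assumptions
chosen for illustration.

```python
import math

def min_expected_cost(mean_life, c_sat=50.0, c_hold=0.05, c_launch=55.0,
                      c_pen=50.0, p_sl=0.95, n_epochs=40):
    """Minimum expected total cost ($M) by initial state for the single-unit model.

    Assumes a zero terminal cost; mean_life is in quarters.
    """
    r = math.exp(-1.0 / mean_life)   # interval reliability for one quarter
    # model[state] = list of (one-period cost, {next_state: probability})
    model = {
        1: [(0.0, {1: r, 3: 1 - r}),
            (c_sat, {2: r, 4: 1 - r})],
        2: [(c_hold, {2: r, 4: 1 - r}),
            (c_launch, {1: p_sl, 3: 1 - p_sl}),
            (c_launch + c_sat, {2: p_sl + r * (1 - p_sl),
                                4: (1 - r) * (1 - p_sl)})],
        3: [(c_pen, {3: 1.0}),
            (c_sat + c_pen, {4: 1.0})],
        4: [(c_pen + c_hold, {4: 1.0}),
            (c_pen + c_launch, {1: p_sl, 3: 1 - p_sl}),
            (c_pen + c_launch + c_sat, {2: p_sl, 4: 1 - p_sl})],
    }
    u = {s: 0.0 for s in model}
    for _ in range(n_epochs - 1):
        # One backward-induction step: best action cost plus expected cost-to-go.
        u = {s: min(cost + sum(p * u[j] for j, p in trans.items())
                    for cost, trans in actions)
             for s, actions in model.items()}
    return u

lifetimes = [1, 5, 10, 20, 40, 80]   # mean lifetimes in quarters
curve = {life: min_expected_cost(life) for life in lifetimes}
for life, by_state in curve.items():
    print(life, {s: round(v, 1) for s, v in by_state.items()})
```

The same loop can sweep any other keyword argument, such as `c_pen` or
`c_launch`, to produce one curve per initial state.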

The first parameter evaluated is the mean satellite lifetime. Recall from Table
4.3 that a mean satellite lifetime of 40 quarters was assumed. In Figure 4.1
the mean satellite lifetime is varied from 1 quarter (3 months) to 80 quarters
(20 years). The graph shows that as the mean satellite lifetime increases, the
minimum expected total cost decreases, as expected. It also shows that as the
mean satellite lifetime approaches 80 quarters, the decrease in the minimum
expected total cost “flattens out.” For this example, if mean satellite
lifetimes are significantly shorter than the 10-year design life, the minimum
expected total cost would increase dramatically. However, if the mean satellite
lifetimes are longer than the 10-year design life, the minimum expected total
cost will decrease, but not significantly. For example, when the mean satellite
lifetime is 24 quarters (6 years) or longer, the rate of decrease of the
minimum expected total cost is less than 10 million dollars for each additional
quarter of satellite mean lifetime. When the mean satellite lifetime is 40
quarters (10 years, the satellite design life), the rate of decrease of the
minimum expected total cost is approximately three-and-a-half million dollars
for each additional quarter of satellite mean lifetime. The real-world trend
has been for satellites to last longer than their design life. Such analysis
may also be useful in establishing the design reliability of a system when it
is being planned.


Figure  4.1  Varying  the  mean  satellite  lifetime  over  a  10-year  time  horizon. 

Another parameter of interest is the penalty cost. Figure 4.2 shows how varying
the penalty cost from zero to 100 million dollars per quarter affects the
minimum expected total cost for a 10-year time horizon. The graph shows a very
low minimum expected total cost when the penalty cost is close to zero because
a very low penalty cost implies there is a very low need or desire to maintain
the satellite.


Figure  4.2  Varying  the  penalty  cost  over  a  10-year  time  horizon. 

When evaluating the case with a very low penalty cost, the algorithm determines
it is cheaper to pay the penalty cost than to maintain the satellite. The
leftmost points of the plot in Figure 4.3 illustrate this point. This graph
shows the minimum expected total cost, as well as its individual components:
the expected satellite cost, the expected launch cost, the expected holding
cost, and the expected penalty cost when following the optimal policy. To aid
in reading the graph, these costs are displayed for only one state. State 3,
the case with no operational satellites and no spares available, is chosen
because this case is equivalent to populating a new constellation. When the
penalty cost ranges from zero through four million dollars, the expected launch
cost is zero. When the penalty cost is five million dollars, the expected
launch cost jumps to 61.792 million dollars and the expected penalty cost drops
to 65.354 million dollars. With a penalty cost of four million dollars the
expected penalty cost is 156 million dollars. For this example, once the
penalty cost reaches a relatively small value (5 million dollars) compared to
the cost of buying and launching a satellite (105 million dollars combined),
the algorithm determines it is less expensive to maintain the satellite than to
pay the penalty costs. When evaluating a new satellite constellation, if the
penalty cost is so low that it is cheaper to pay the penalty than to maintain
the constellation, then the constellation may not be needed. Such a case
suggests evaluating different alternatives for achieving the mission, such as
accomplishing the mission by putting a secondary payload on some other
constellation.


Figure 4.3 Costs over a 10-year time horizon for the single-unit example.

As the penalty cost rises, the algorithm must balance the cost of maintaining
the satellite against the penalty cost for not being able to perform the
mission. The minimum expected total cost rises as the penalty cost rises for
two reasons. First, whenever a penalty must be paid the cost is higher. Second,
more money will be spent on buying, holding, and launching satellites in order
to avoid paying the higher penalty cost than would have been spent on
maintaining the satellite if the penalty cost were lower.

When the penalty cost changes from 42 to 43 million dollars, the graph of
Figure 4.3 shows a significant decrease in the expected penalty cost and a
significant increase in the expected cost of satellites. With a penalty cost of
42 million dollars or less, the expected holding cost is zero dollars, which
implies that no satellites should be held in storage for this range of penalty
cost in order to achieve the minimum expected total cost. When the penalty cost
is 43 million dollars, the expected holding cost increases to 1.162 million
dollars, indicating that satellites should be held in storage to achieve the
minimum expected total cost. The expected cost of satellites increases at this
point because the satellite being held in storage must now be purchased. The
expected penalty cost drops at this point because if the satellite fails, a
replacement can be launched in the next time period. To replace a failed
satellite if there is no satellite in storage, a replacement must be ordered in
one time period and launched in the next.

Figure 4.4 graphs the minimum expected total cost when the launch cost is
varied from zero to 150 million dollars. The graph shows that as the launch
cost increases, so does the minimum expected total cost of maintaining the
satellite, as expected. The costs increase linearly and are straightforward to
interpret. A graph of the minimum expected total cost when varying the cost of
a satellite also increases linearly. Because of the uncomplicated nature of
this graph, it is not presented here.

Figure 4.4 Varying the launch cost over a 10-year time horizon.

The minimum expected total cost when varying the holding cost from zero to 100
million dollars is shown in Figure 4.5. The graph shows that for this example,
the holding cost has almost no effect on the minimum expected total cost. The
holding cost is insignificant because it is small relative to the cost of
purchasing and launching a satellite as well as the penalty cost for failing to
maintain the satellite. By taking advantage of the ability to store satellites,
the penalty cost can be minimized. In this example, if the satellite failed and
no replacement was available, a replacement would have to be ordered and then
the replacement could be launched in the next time period. This would result in
paying a penalty for two time periods. If a replacement satellite was available
when the satellite failed, the replacement could be launched immediately, given
the earlier assumption that launch vehicles and facilities are always
available. In this case, a penalty is only paid for one time period. Clearly,
the size of the penalty cost and the duration of the penalty period affect the
minimum expected total cost. Analysis of the model allows a policy for the use
of spare satellites for the constellation to be determined.

Figure 4.5 Varying the holding cost over a 10-year time horizon.

4.3 Multi-unit Example


The multi-unit example shows how the model works for a constellation of
satellites. The decision epochs are defined to be T = 1, 2, ..., N where N = 40
quarters, which represents a 10-year time horizon. The states were described in
Section 3.4, where each state specifies which satellites are operational and
the number of spare satellites available. The states are listed in Table 4.11
for the case where there are M = 3 satellites in the constellation.
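
The state space for M = 3 can be generated mechanically: every subset of
working satellites is paired with every spare level from 0 to 3, giving
8 x 4 = 32 states. The Python sketch below enumerates the states in the same
order as Table 4.11. The function name `enumerate_states` and the tuple
representation are assumptions made for illustration, and the cap of three
spares is read from the table rather than from a stated rule in this section.

```python
from itertools import combinations

M = 3            # satellites in the constellation
MAX_SPARES = 3   # largest spare level shown in Table 4.11

def enumerate_states(m=M, max_spares=MAX_SPARES):
    """Map state index -> (tuple of working satellites, number of spares)."""
    states = {}
    index = 1
    # Larger working sets first, in lexicographic order within each size.
    for size in range(m, -1, -1):
        for working in combinations(range(1, m + 1), size):
            for spares in range(max_spares + 1):
                states[index] = (working, spares)
                index += 1
    return states

STATES = enumerate_states()
for idx in (1, 5, 17, 29, 32):
    working, spares = STATES[idx]
    print(f"s_{idx}: working={working or 'None'}, spares={spares}")
```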

The action sets for the three satellite constellation problem are defined using
the rules from Section 3.4. Following those rules, states sharing an equal
number of spare satellites available also share the set of actions which can be
performed in one of those states. For example, states s_1, s_5, s_9, s_{13},
s_{17}, s_{21}, s_{25}, and s_{29} each have zero spare satellites available at
the beginning of the time period, and therefore each state has the same set of
available actions. The actions for the three satellite constellation are
summarized in four tables (Tables 4.12 through 4.15) representing the four
levels of spare satellites available: 0, 1, 2, and 3.

Table 4.11 Three satellite problem state definitions.

State    Working Satellites   Available Spares      State    Working Satellites   Available Spares
------   ------------------   ----------------      ------   ------------------   ----------------
s_1      1,2,3                0                     s_{17}   1                    0
s_2      1,2,3                1                     s_{18}   1                    1
s_3      1,2,3                2                     s_{19}   1                    2
s_4      1,2,3                3                     s_{20}   1                    3
s_5      1,2                  0                     s_{21}   2                    0
s_6      1,2                  1                     s_{22}   2                    1
s_7      1,2                  2                     s_{23}   2                    2
s_8      1,2                  3                     s_{24}   2                    3
s_9      1,3                  0                     s_{25}   3                    0
s_{10}   1,3                  1                     s_{26}   3                    1
s_{11}   1,3                  2                     s_{27}   3                    2
s_{12}   1,3                  3                     s_{28}   3                    3
s_{13}   2,3                  0                     s_{29}   None                 0
s_{14}   2,3                  1                     s_{30}   None                 1
s_{15}   2,3                  2                     s_{31}   None                 2
s_{16}   2,3                  3                     s_{32}   None                 3

Table 4.12 lists the actions applicable to states with no spare satellites
available at the decision epoch. The action sets summarized in this table are
A_1, A_5, A_9, A_{13}, A_{17}, A_{21}, A_{25}, and A_{29}. These action sets
contain only a few actions because there are no spare satellites available with
which to perform replacement actions.


Table  4.12  Three  satellite  problem  action  set  for  states  with  no  spares. 


Action     Definition
-------    ---------------------------------------
a_{s,1}    Let the system run without intervention
a_{s,2}    Order one spare satellite
a_{s,3}    Order two spare satellites
a_{s,4}    Order three spare satellites

Table 4.13 lists the actions applicable to states with one spare satellite
available at the decision epoch. The action sets summarized in this table are
A_2, A_6, A_{10}, A_{14}, A_{18}, A_{22}, A_{26}, and A_{30}. Note that the
replacement of each different satellite requires a separate action.

Table 4.13 Three satellite problem action set for states with one spare.

Action      Definition
--------    ---------------------------------------------------
a_{s,1}     Let the system run without intervention
a_{s,2}     Order one spare satellite
a_{s,3}     Order two spare satellites
a_{s,4}     Replace satellite 1
a_{s,5}     Replace satellite 1 and order one spare satellite
a_{s,6}     Replace satellite 1 and order two spare satellites
a_{s,7}     Replace satellite 1 and order three spare satellites
a_{s,8}     Replace satellite 2
a_{s,9}     Replace satellite 2 and order one spare satellite
a_{s,10}    Replace satellite 2 and order two spare satellites
a_{s,11}    Replace satellite 2 and order three spare satellites
a_{s,12}    Replace satellite 3
a_{s,13}    Replace satellite 3 and order one spare satellite
a_{s,14}    Replace satellite 3 and order two spare satellites
a_{s,15}    Replace satellite 3 and order three spare satellites

Table 4.14 lists the actions applicable to states with two spare satellites
available at the decision epoch. The action sets summarized in this table are
A_3, A_7, A_{11}, A_{15}, A_{19}, A_{23}, A_{27}, and A_{31}. This set of
actions can call for the replacement of no satellites, a single satellite, or
two satellites. Each possible combination of satellite replacement must be
represented by a separate action.

Table 4.15 lists the actions applicable to states with three spare satellites
available at the decision epoch. The action sets summarized in this table are
A_4, A_8, A_{12}, A_{16}, A_{20}, A_{24}, A_{28}, and A_{32}. This set of
actions allows for the replacement of any number of the satellites and must
have separate actions representing each replacement strategy.
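
The four action tables share a common pattern that can be generated rather
than typed by hand. The Python sketch below enumerates the actions for a given
number of available spares. The rule it encodes, that the spares left after
replacement plus the spares ordered may not exceed three, is inferred from the
action counts and ordering in Tables 4.12 through 4.15 and should be checked
against the rules of Section 3.4. The function name `enumerate_actions` and the
tuple encoding are illustrative assumptions.

```python
from itertools import combinations

def enumerate_actions(spares, m=3, max_spares=3):
    """List actions as (satellites replaced, spares ordered) for a spare level.

    Inferred rule: spares remaining after replacement plus spares ordered
    may not exceed max_spares.
    """
    actions = []
    # A replacement consumes one spare, so at most `spares` satellites can be replaced.
    for n_replace in range(min(spares, m) + 1):
        for replaced in combinations(range(1, m + 1), n_replace):
            remaining = spares - n_replace
            for ordered in range(max_spares - remaining + 1):
                actions.append((replaced, ordered))
    return actions

for level in range(4):
    acts = enumerate_actions(level)
    print(f"{level} spare(s): {len(acts)} actions")

# Action 15 for states with two spares (list index 14).
print(enumerate_actions(2)[14])
```

With this ordering, the entry printed last corresponds to a_{s,15} in Table
4.14, which replaces satellites 1 and 2 and orders three spare satellites.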

The transition probabilities are determined using the definitions provided in
Section 3.4. For this example, the reliability parameters are the same as for
the single-unit example and are extended to apply to all satellites in the
constellation. These reliability parameters and the reliability probability
values are summarized in Table 4.16, where \lambda_m, R_{sat,m}, and P_{sl,m}
refer to the values specific to satellite m. For this example each satellite
uses the same parameter and reliability values, although it is possible for
different satellites to have different values.

Table 4.14 Three satellite problem action set for states with two spares.

Action      Definition
--------    ----------------------------------------------------------
a_{s,1}     Let the system run without intervention
a_{s,2}     Order one spare satellite
a_{s,3}     Replace satellite 1
a_{s,4}     Replace satellite 1 and order one spare satellite
a_{s,5}     Replace satellite 1 and order two spare satellites
a_{s,6}     Replace satellite 2
a_{s,7}     Replace satellite 2 and order one spare satellite
a_{s,8}     Replace satellite 2 and order two spare satellites
a_{s,9}     Replace satellite 3
a_{s,10}    Replace satellite 3 and order one spare satellite
a_{s,11}    Replace satellite 3 and order two spare satellites
a_{s,12}    Replace satellites 1 and 2
a_{s,13}    Replace satellites 1 and 2 and order one spare satellite
a_{s,14}    Replace satellites 1 and 2 and order two spare satellites
a_{s,15}    Replace satellites 1 and 2 and order three spare satellites
a_{s,16}    Replace satellites 1 and 3
a_{s,17}    Replace satellites 1 and 3 and order one spare satellite
a_{s,18}    Replace satellites 1 and 3 and order two spare satellites
a_{s,19}    Replace satellites 1 and 3 and order three spare satellites
a_{s,20}    Replace satellites 2 and 3
a_{s,21}    Replace satellites 2 and 3 and order one spare satellite
a_{s,22}    Replace satellites 2 and 3 and order two spare satellites
a_{s,23}    Replace satellites 2 and 3 and order three spare satellites

The transition probabilities for the multi-unit example are computed as
described in Section 3.4. Due to the complexity of the system and the number of
state transitions, extreme care must be taken to identify each event that can
cause a transition from a state c to a state d (c, d \in S). The probabilities
of all such events are summed to determine the transition probability from
state c to state d. For this example, the transition probabilities are too
lengthy to be enumerated here, but an example of these transition probabilities
is given in Table 4.17. This table shows the transition probabilities for
choosing action a_{3,15}.

Table 4.15 Three satellite problem action set for states with three spares.

Action      Definition
--------    -----------------------------------------------------------
a_{s,1}     Let the system run without intervention
a_{s,2}     Replace satellite 1
a_{s,3}     Replace satellite 1 and order one spare satellite
a_{s,4}     Replace satellite 2
a_{s,5}     Replace satellite 2 and order one spare satellite
a_{s,6}     Replace satellite 3
a_{s,7}     Replace satellite 3 and order one spare satellite
a_{s,8}     Replace satellites 1 and 2
a_{s,9}     Replace satellites 1 and 2 and order one spare satellite
a_{s,10}    Replace satellites 1 and 2 and order two spare satellites
a_{s,11}    Replace satellites 1 and 3
a_{s,12}    Replace satellites 1 and 3 and order one spare satellite
a_{s,13}    Replace satellites 1 and 3 and order two spare satellites
a_{s,14}    Replace satellites 2 and 3
a_{s,15}    Replace satellites 2 and 3 and order one spare satellite
a_{s,16}    Replace satellites 2 and 3 and order two spare satellites
a_{s,17}    Replace satellites 1, 2 and 3
a_{s,18}    Replace satellites 1, 2 and 3 and order one spare satellite
a_{s,19}    Replace satellites 1, 2 and 3 and order two spare satellites
a_{s,20}    Replace satellites 1, 2 and 3 and order three spare satellites

The multi-unit problem uses the same reward parameters used by the single-unit
example, which are given in Table 4.6. The reward values are also too lengthy
to be enumerated here, but an example is given in Table 4.18. This table
provides the rewards for all of the possible actions for state s_3.

This problem was solved using the Backward Induction Algorithm programmed
in MATLAB®. Table 4.19 lists execution time characteristics for 30 runs of the
program. The runs were conducted on a Dell® Inspiron 8100 with a 1 gigahertz
Intel® Pentium® III processor and 256 megabytes of RAM, running the Microsoft®
Windows 2000 Professional operating system.
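
A minimal Python sketch of finite-horizon backward induction for a cost-minimizing
Markov decision process is given below. It is not the MATLAB program used for the
results in this chapter. The function signature, the zero terminal cost, and the
two-state toy problem (with made-up costs and a reliability of 0.9) are assumptions
chosen only to keep the example self-contained.

```python
def backward_induction(n_states, actions, cost, trans, horizon):
    """
    Finite-horizon backward induction (cost minimization).

    actions[s]  -> allowable actions in state s
    cost(s, a)  -> immediate cost of action a in state s
    trans(s, a) -> dict {next_state: probability}
    Returns value[t][s] and policy[t][s] for t = 0, ..., horizon - 1;
    value[horizon][s] is the terminal cost, assumed to be zero.
    """
    value = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[None] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):          # step backward in time
        for s in range(n_states):
            best_a, best_v = None, float("inf")
            for a in actions[s]:
                expected_future = sum(
                    p * value[t + 1][s2] for s2, p in trans(s, a).items()
                )
                v = cost(s, a) + expected_future
                if v < best_v:
                    best_a, best_v = a, v
            value[t][s] = best_v
            policy[t][s] = best_a
    return value, policy


# Toy problem (hypothetical): state 0 = operational, state 1 = failed.
ACTIONS = {0: ["wait"], 1: ["wait", "replace"]}


def toy_cost(s, a):
    if a == "replace":
        return 5.0
    return 10.0 if s == 1 else 0.0      # penalty while failed


def toy_trans(s, a):
    if a == "replace":
        return {0: 1.0}
    return {0: 0.9, 1: 0.1} if s == 0 else {1: 1.0}


value, policy = backward_induction(2, ACTIONS, toy_cost, toy_trans, horizon=2)
print(value[0], policy[0])
```

The same recursion applies to the 32-state constellation model once the action
sets, rewards, and transition probabilities described above are supplied.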

Table 4.16 Three satellite problem reliability values.

Reliability                            Notation        Numerical Value
Mean lifetime of the satellite         λ_1^{-1}        40
Mean lifetime of the satellite         λ_2^{-1}        40
Mean lifetime of the satellite         λ_3^{-1}        40
Interval reliability of the satellite  R_sat1          0.9753
Interval reliability of the satellite  R_sat2          0.9753
Interval reliability of the satellite  R_sat3          0.9753
Probability of a successful launch     P_sl1           0.9500
Probability of a successful launch     P_sl2           0.9500
Probability of a successful launch     P_sl3           0.9500

The policy derived from the algorithm is shown in Table 4.20. The table
specifies which action should be selected depending on the system state during time
epochs 1, ..., 40 in order to minimize the total expected cost. For example, if the
system is in state s_31 during any of the first 26 time epochs, action a_{31,14}
(replace satellites 1 and 2 and order two new replacements) should be chosen. If the
system is in state s_31 during time epochs 27 through 35, then action a_{31,13}
(replace satellites 1 and 2 and order a new replacement) should be chosen. If the
system is in state s_31 during time epochs 36 through 37, then action a_{31,12}
(replace satellites 1 and 2 and do not order a replacement) should be chosen. If in
state s_31 during time epochs 38 or 39, action a_{31,1} (let the system run without
intervention) should be chosen. No decisions are made in time epoch 40 as there are
no time periods evaluated after this epoch. Decisions are implemented at the
beginning of the next time period, and any decision made at time epoch 40 for this
example would not be evaluated.
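
Reading epoch ranges from a policy row of Table 4.20 amounts to a run-length
encoding of the row. The short Python sketch below illustrates this for the s_31 row.
The helper name `epoch_ranges` is hypothetical, and the row is entered in compressed
form from the 40 entries listed in the table.

```python
from itertools import groupby


def epoch_ranges(policy_row):
    """Collapse a per-epoch action list into (action, first_epoch, last_epoch)."""
    ranges, epoch = [], 1
    for action, run in groupby(policy_row):
        length = len(list(run))
        ranges.append((action, epoch, epoch + length - 1))
        epoch += length
    return ranges


# Row for state s_31 as listed in Table 4.20 (40 time epochs).
s31 = [14] * 26 + [13] * 9 + [12] * 2 + [1] * 2 + [0]
for action, first, last in epoch_ranges(s31):
    print(f"epochs {first}-{last}: action {action}")
```

The resulting ranges (1 to 26, 27 to 35, 36 to 37, 38 to 39, and 40) are the ones
discussed in the example above.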

Table 4.21 shows the minimum expected total cost for following the policy
of Table 4.20. Note that the minimum expected cost is dependent upon the initial
state of the system.

Table 4.17 Transition probabilities for action a_{3,15}.

Probability: P_t(s_4 | s_3, a_{3,15})
Definition:  (P_sl1 × P_sl2 × R_sat3)
             + ((1 - P_sl1) × R_sat1 × P_sl2 × R_sat3)
             + (P_sl1 × (1 - P_sl2) × R_sat2 × R_sat3)
             + ((1 - P_sl1) × (1 - P_sl2) × R_sat1 × R_sat2 × R_sat3)
Value:       9.7290 × 10^-1

Probability: P_t(s_8 | s_3, a_{3,15})
Definition:  (P_sl1 × P_sl2 × (1 - R_sat3))
             + ((1 - P_sl1) × R_sat1 × P_sl2 × (1 - R_sat3))
             + (P_sl1 × (1 - P_sl2) × R_sat2 × (1 - R_sat3))
             + ((1 - P_sl1) × (1 - P_sl2) × R_sat1 × R_sat2 × (1 - R_sat3))
Value:       2.4629 × 10^-2

Probability: P_t(s_12 | s_3, a_{3,15})
Definition:  (P_sl1 × (1 - P_sl2) × (1 - R_sat2) × R_sat3)
             + ((1 - P_sl1) × (1 - P_sl2) × R_sat1 × (1 - R_sat2) × R_sat3)
Value:       1.2025 × 10^-3

Probability: P_t(s_16 | s_3, a_{3,15})
Definition:  ((1 - P_sl1) × (1 - R_sat1) × P_sl2 × R_sat3)
             + ((1 - P_sl1) × (1 - P_sl2) × (1 - R_sat1) × R_sat2 × R_sat3)
Value:       1.2025 × 10^-3

Probability: P_t(s_20 | s_3, a_{3,15})
Definition:  (P_sl1 × (1 - P_sl2) × (1 - R_sat2) × (1 - R_sat3))
             + ((1 - P_sl1) × (1 - P_sl2) × R_sat1 × (1 - R_sat2) × (1 - R_sat3))
Value:       3.0442 × 10^-5

Probability: P_t(s_24 | s_3, a_{3,15})
Definition:  ((1 - P_sl1) × (1 - R_sat1) × P_sl2 × (1 - R_sat3))
             + ((1 - P_sl1) × (1 - P_sl2) × (1 - R_sat1) × R_sat2 × (1 - R_sat3))
Value:       3.0442 × 10^-5

Probability: P_t(s_28 | s_3, a_{3,15})
Definition:  (1 - P_sl1) × (1 - P_sl2) × (1 - R_sat1) × (1 - R_sat2) × R_sat3
Value:       1.4864 × 10^-6

Probability: P_t(s_32 | s_3, a_{3,15})
Definition:  (1 - P_sl1) × (1 - P_sl2) × (1 - R_sat1) × (1 - R_sat2) × (1 - R_sat3)
Value:       3.7628 × 10^-8

4.4 Analysis of the Multi-unit Example

The analysis of the multi-unit example closely follows the analysis of the
single-unit example. For satellite cost, launch cost, and holding cost, the graphs of
the minimum expected total cost increase linearly and follow the same patterns as
the graphs presented in Section 4.2. For these reasons, analysis of these parameters
is not presented here.

Table 4.18 Three satellite reward values example.

Reward                  Definition                     Value
r_t(s_3, a_{3,1})       2 × C_hold                     $100,000
r_t(s_3, a_{3,2})       2 × C_hold + C_sat             $50,100,000
r_t(s_3, a_{3,3})       C_hold + C_launch              $55,050,000
r_t(s_3, a_{3,4})       C_hold + C_launch + C_sat      $105,050,000
r_t(s_3, a_{3,5})       C_hold + C_launch + 2 × C_sat  $155,050,000
r_t(s_3, a_{3,6})       C_hold + C_launch              $55,050,000
r_t(s_3, a_{3,7})       C_hold + C_launch + C_sat      $105,050,000
r_t(s_3, a_{3,8})       C_hold + C_launch + 2 × C_sat  $155,050,000
r_t(s_3, a_{3,9})       C_hold + C_launch              $55,050,000
r_t(s_3, a_{3,10})      C_hold + C_launch + C_sat      $105,050,000
r_t(s_3, a_{3,11})      C_hold + C_launch + 2 × C_sat  $155,050,000
r_t(s_3, a_{3,12})      2 × C_launch                   $110,000,000
r_t(s_3, a_{3,13})      2 × C_launch + C_sat           $160,000,000
r_t(s_3, a_{3,14})      2 × C_launch + 2 × C_sat       $210,000,000
r_t(s_3, a_{3,15})      2 × C_launch + 3 × C_sat       $260,000,000
r_t(s_3, a_{3,16})      2 × C_launch                   $110,000,000
r_t(s_3, a_{3,17})      2 × C_launch + C_sat           $160,000,000
r_t(s_3, a_{3,18})      2 × C_launch + 2 × C_sat       $210,000,000
r_t(s_3, a_{3,19})      2 × C_launch + 3 × C_sat       $260,000,000
r_t(s_3, a_{3,20})      2 × C_launch                   $110,000,000
r_t(s_3, a_{3,21})      2 × C_launch + C_sat           $160,000,000
r_t(s_3, a_{3,22})      2 × C_launch + 2 × C_sat       $210,000,000
r_t(s_3, a_{3,23})      2 × C_launch + 3 × C_sat       $260,000,000

The mean satellite lifetime for the multi-unit example is also varied from 1
quarter (3 months) to 80 quarters (20 years). For the multi-unit example, the
minimum expected total cost is equal for certain initial states. For example, all
states with two operational satellites and no available spares (e.g., states s_5,
s_9, and s_13) have the same minimum expected total cost. The optimal policy
requires different actions for each state, but the costs are equal. Figure 4.6 shows
the minimum expected total cost for 12 initial states. Figure 4.7 shows the minimum
expected total cost for 12 other initial states. Each graph follows the same basic
pattern, so it is sufficient to present only one graph in order to illustrate
various arguments regarding the plot. Figures 4.6 and 4.7 show that, just as in the
single-unit case, as the mean satellite lifetime increases, the minimum expected
total cost decreases. For this multi-unit example, the decrease in minimum expected
total cost "flattens" out as the mean satellite lifetime approaches 80 quarters, or
20 years. Once the mean satellite lifetime reaches 42 quarters, or 10.5 years, the
minimum expected total cost changes by less than 10 million dollars for every
subsequent increase of one quarter in the mean satellite lifetime, for each of the
32 possible starting states. For this example, a 10 million dollar change is less
than a 2.5 percent change in the minimum expected total cost.
Table 4.19 Multi-unit run time characteristics.

Execution Statistic    Time (seconds)
Mean                   13.7528
Mode                   13.7600
Minimum                13.7190
Maximum                13.8500
Standard Deviation     0.0282

Figure  4.6  Varying  the  mean  satellite  lifetime  over  a  10-year  time  horizon. 

Table 4.20 Multiple satellite policies.

State   Policy (action selected in time epochs 1, ..., 40)
s_1     2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
s_2     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
s_3     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
s_4     1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
s_5     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 0
s_6     13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 1, 1, 0
s_7     9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 1, 1, 0
s_8     6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 1, 1, 0
s_9     3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 0
s_10    9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 1, 1, 0
s_11    6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 1, 1, 0
s_12    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 0
s_13    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 0
s_14    5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 0
s_15    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 0
s_16    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 0
s_17    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 0
s_18    10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 9, 9, 9, 9, 8, 8, 1, 1, 0
s_19    21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 1, 1, 0
s_20    14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 1, 1, 0
s_21    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 0
s_22    6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 1, 1, 0
s_23    17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 1, 1, 0
s_24    11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 1, 1, 0
s_25    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 0
s_26    6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 4, 1, 1, 0
s_27    13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 1, 1, 0
s_28    8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 1, 1, 0
s_29    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 1, 1, 1, 0
s_30    7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 6, 6, 6, 4, 4, 1, 1, 0
s_31    14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 13, 13, 13, 13, 13, 13, 13, 13, 13, 12, 12, 1, 1, 0
s_32    18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 1, 1, 0

A graph of the minimum expected total cost for a 10-year time horizon as the
penalty cost is varied is shown in Figure 4.8. The penalty cost is varied from zero
to 100 million dollars per quarter. This plot only shows the values for 12 of the
initial states. Plots of the other initial states follow the pattern of this graph. Just
as in the single-unit example, the figure shows a very low minimum expected total
cost when the penalty cost is close to zero. When evaluating this case, the algorithm
determines it is less expensive to pay penalties than to purchase and launch satellites
to maintain the constellation. However, once the penalty cost reaches a relatively
small value compared to the cost of buying and launching a satellite, the algorithm
determines it is more advantageous to maintain the satellite than to pay the penalty
costs.

Table 4.21 Multiple satellite model minimum expected cost.

Initial State   Value ($M)    Initial State   Value ($M)
s_1             470.025       s_17            880.282
s_2             420.075       s_18            784.181
s_3             374.805       s_19            688.217
s_4             341.602       s_20            638.267
s_5             675.126       s_21            880.282
s_6             579.107       s_22            784.181
s_7             529.157       s_23            688.217
s_8             483.708       s_24            638.267
s_9             675.126       s_25            880.282
s_10            579.107       s_26            784.181
s_11            529.157       s_27            688.217
s_12            483.708       s_28            638.267
s_13            675.126       s_29            1085.443
s_14            579.107       s_30            989.309
s_15            529.157       s_31            893.265
s_16            483.708       s_32            797.356

This chapter began with an example of a single-unit problem in order to
demonstrate the use of Markov decision processes. Tables 4.9 and 4.10 showed the
minimum expected total cost's dependence on the initial state of the system for the
time horizon evaluated. Sensitivity analysis of problem parameters demonstrated the
impact these values made on the minimum expected total cost. Next, a multi-unit, or
constellation, example was given. Similar results and sensitivity analysis followed
the example. Chapter 5 presents a summary of this thesis, discusses insights gained
during the research, explains the contributions, and details future work in the area
of this thesis.
5.  Conclusions  and  Future  Research


The primary objectives of this thesis were to create an analytical model of
satellite constellations, to find optimal replacement policies that minimize the
expected total cost of maintaining the constellations, and to study how changes to
model parameters affect the policy and the minimum expected total cost. The
importance of analytically modeling satellite constellations via Markov decision
processes is that a provably optimal replacement policy can be derived. Knowing the
optimal replacement policy aids a decision maker in making informed decisions
regarding the maintenance of a constellation. While following the policy does not
guarantee that the minimum cost is achieved, this course of action ideally has a
higher likelihood of achieving the minimum cost than any other strategy. Sensitivity
analysis on the model parameters provided a way to evaluate how the optimal value is
affected by changes to the problem parameters. If the optimal value changes
dramatically with small changes to a parameter, it will be important to ensure that
accurate information about the parameter is included in the model.

In order to determine an optimal replacement policy, an analytical model of
the satellite constellation was created and solved using Markov decision processes
(stochastic dynamic programming). The third chapter provided a review of Markov
decision processes, stated the assumptions made in modeling the satellite
constellations, and formulated models for single-unit and multi-unit systems. A
discussion of linear programming formulations extended the review of Markov decision
processes. Several assumptions about satellites and their operation were made in
order to model satellite constellations as Markov decision processes and to avoid
the problem of state space explosion. These assumptions are generally mild, in that
they do not greatly modify the characteristics of many of the existing satellite
constellations. The single-unit model was formulated for the purpose of
demonstrating Markov decision process modeling for a satellite constellation. For
the single-unit model it was possible to explicitly identify the states, actions,
transition probabilities, and rewards. This aided the understanding of the
multi-unit model. The multi-unit model was defined by presenting rules that were
used to determine the system states, actions, transition probabilities, and rewards.
Equations for computing the number of states and the number of actions were provided
to determine the size of the state space as a function of the number of satellites
in the constellation.

Chapter 4 specified model parameters for both single-unit and multi-unit
examples. The examples of each problem were followed by sensitivity analysis of model
parameters. Model parameters were specified by determining realistic values from
actual space systems. The multi-unit example used the same parameter values as
the single-unit example. The parameter values were assumed to be identical for
each satellite in the constellation. A constellation of three active satellites was
assumed for the multi-unit example. Analysis of both the single-unit and multi-unit
examples included sensitivity analysis of the model parameters and interpretation of
these analyses.

The main contribution of this research was the development of an analytical
model used to determine the optimal replacement policy for satellite constellations.
This analytical model serves as a framework upon which a more detailed model can be
based. The use of an analytical model provides a provably optimal replacement policy
for the given assumptions and offers an alternative to potentially time-consuming
simulation. High-resolution simulation modeling is often used to analyze various
policies for satellite constellations. However, simulation cannot prove the optimality
of a given policy. Policies found by the analytical methodology of this thesis can
be used to determine budgets for maintaining the constellation. Because the models
do not assume budgetary constraints, the policies found by using Markov decision
processes provide the minimum expected total cost over the entire time horizon being
evaluated. Therefore, by considering the actions of the optimal policy for a given
time period t, a realistic budgetary value can be derived. The sensitivity analysis
of the model parameters can be used to determine where small changes to model
parameters could have a large effect on the minimum expected total cost.

For the purpose of analytical tractability, several assumptions were employed
regarding the satellites and their operation. The model can be extended by relaxing
some of these assumptions. The first important assumption to be relaxed is that
of exponentially distributed satellite lifetimes. One alternative is to use phase-type
(PH) distributions to model the satellite lifetime distribution. PH distributions can
be used to represent general distributions by the convolution of multiple exponential
distributions [1]. Well-known examples of PH distributions include the Erlang and
Coxian distributions. The main feature of PH distributions that makes them of value
in this context is that they maintain the Markovian (memoryless) property [24] and
are able to approximate any probability distribution. However, such approximations
come at a computational expense due to the additional number of required system
states for the underlying Markov chain. Hence, accuracy must be balanced with this
expense.
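
The trade-off can be illustrated with the Erlang distribution, the simplest PH
example. The Python sketch below is a hedged illustration rather than part of the
thesis models. It assumes an Erlang-k lifetime built from k exponential phases in
series with the same overall mean, and it assumes that each satellite must then be
tracked as being in one of k phases or failed. The function names are hypothetical.

```python
import math


def erlang_survival(t: float, mean: float, k: int) -> float:
    """P(lifetime > t) for a new satellite with an Erlang-k lifetime of given mean."""
    rate = k / mean                      # rate of each exponential phase
    x = rate * t
    return math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(k))


def operational_configurations(n_sats: int, k: int) -> int:
    """Number of joint satellite conditions: each is in one of k phases or failed."""
    return (k + 1) ** n_sats


for k in (1, 2, 4):
    # One-quarter survival with a 40-quarter mean, three-satellite constellation.
    print(k, erlang_survival(1.0, 40.0, k), operational_configurations(3, k))
```

For k = 1 the sketch reduces to the exponential case used in this thesis, with 8
joint satellite conditions for three satellites. Larger k changes the shape of the
lifetime distribution but multiplies the number of conditions that the Markov chain
must distinguish.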

The models assumed that a satellite is either operational or non-operational.
Allowing the level of the satellite's health to be represented by several states would
more accurately represent real-world scenarios. The different states can
hypothetically represent the amount of remaining useful life for a satellite. Hence,
if some penalty cost is assessed for operating in a degraded state, the model could
ideally determine the degradation point at which satellites should be replaced to
minimize the expected total cost over the time horizon evaluated. Increasing the
number of states representing the condition of a satellite greatly increases not only
the state space of the entire problem, but also the number of state transitions that
must be determined. The computational power and available memory are important
considerations in determining how many levels of satellite health should be
represented.
Allowing multiple satellites to be launched on a single launcher is another
possible improvement. However, the ability to do this depends on the satellite system
being evaluated. For example, it is unrealistic to think that multiple heavy satellites,
such as Milstar, could be launched on a single launcher. Relaxing this assumption
would be valid only for certain satellite systems. However, the model is robust
enough to be extended to such situations when appropriate.

Modeling the capability to have on-orbit spare satellites is another possible
improvement. There are actually two distinct types of on-orbit spares that could be
modeled. Hot spares would be activated and ready to perform the mission as soon
as the primary satellite could no longer do so. A cold spare would be in a standby
mode and would need to be activated in the event that the primary satellite could
no longer perform the mission. A hot spare would incur faster degradation than a
cold spare, but both would incur less degradation than an active satellite. Including
on-orbit spares in the model would allow this capability to be evaluated to determine
whether it is a cost-effective measure. It is likely that on-orbit spares would be
used when the penalty cost of a mission failure is very high.

This thesis suggested that the penalty cost should be determined through the
use of a utility or value function. Further work in the area of determining the utility
of a satellite constellation is necessary to properly use this aspect of the model (cf.
[33]). Furthermore, the holding cost could be represented as a utility function. When
a satellite is built and held in storage, advances in technology and design are not
incorporated into the satellite prior to its launch unless costly retrofits are made.
These changes can improve the capabilities of the satellite and extend its lifetime.
Therefore, it would be more realistic to account for this opportunity cost in future
analyses.

Another generalization of the model is to allow the lead time for replacement
orders to be a random variable. This would allow the production time of a satellite
to span multiple decision epochs. The production time for some satellites is
longer than three months. This change would make the models more realistic by
requiring that replacement satellites be ordered before the expected failure of active
satellites. This may result in a replacement not being available when needed, or in
paying large holding costs for replacement satellites built well before the time they
are required.

Another area of study is the availability of launchers and launch facilities.
Under normal operating conditions, a sufficient number of launchers and launch
facilities are available to launch satellites into orbit given proper planning and
scheduling. These conditions are assumed to exist in the proposed models. In a surge
period, or when rapid replenishment of a constellation is required, this assumption
is not likely to hold true. The models presented here are only concerned with a
single constellation. An in-depth study of launcher and launch facility availability
would need to incorporate the priority given to a particular constellation to
determine the availability of resources to that constellation.

A major challenge during the course of this thesis was determining the scope
of this initial effort. Incorporating more detail into an analytical model increases
the state space of the problem. Determining the level of detail at which to model
satellite constellations required balancing the degree of realism in model outputs
against the computational complexity of finding results. In order to show the value
of an analytical modeling approach, sufficient problem assumptions were required to
allow the model to be presented in a clear manner. Relaxation of these assumptions is
fertile ground for future research on this problem.